PERSISTENCE OF THE RESISTANCE TO PERSUASION
INDUCED BY VARIOUS TYPES OF PRIOR BELIEF DEFENSES 1


WILLIAM J. McGUIRE 


Department of Social Psychology, Columbia University 


A number of previous studies have tested 
the relative efficacy of various types of prior 
defenses in making a person’s belief resistant 
to change when he is later confronted with 
massive counterarguments against the belief. 
None of these previous studies were designed 
to measure the effect on resistance of varying 
the time interval between the defense and 
the attack. By systematically varying this in- 
terval, the present experiment investigates the 
relative persistence of the immunity to per-
suasion conferred by the different types of
prior defenses. It is of some interest to know 
for each type of defense the rate at which its 
conferred immunity decays over time. Of 
even greater theoretical interest are compari- 
sons among the decay rates for the different 
types of defense. 


The predictions regarding these differential 
decay rates derive from the same postulates 
as gave rise to the earlier predictions, tested 
and confirmed in previous experiments, re- 
garding the relative immunizing effectiveness 
of various defenses without regard to the
time interval between defense and attack.
Hence, it is useful to mention several of the 
relevant previous findings and their theoreti- 
cal bases. The previous studies, and the pres- 
ent one as well, used cultural truisms as the 
beliefs being defended and attacked—for ex- 
ample, the belief that “We should brush our 
teeth after every meal if at all possible.” It 
had been postulated that there is little be- 
lief-dissonant information available regarding 
such cultural truisms in the person’s normal 
ideological environment. This unavailability, 
combined with the characteristic tendency to 
avoid even such belief-dissonant material as is 
available, would have left the person under-
estimating the vulnerability of his belief and,
hence, unmotivated to acquire bolstering ma-
terial and unprepared to deal with strong
counterarguments when he is forced to expose
himself to them.

1 This study was supported, in part, by a grant
from the National Science Foundation, Division of
Social Sciences.

From this theoretical analysis follow sev- 
eral of the previously confirmed hypotheses 
regarding immunization against persuasion 
which are relevant to the hypotheses being 
tested in the present experiment. One of these 
previous findings (McGuire & Papageorgis, 
1961) is that prior refutational defenses are 
superior to prior supportive defenses in mak- 
ing cultural truisms resistant to subsequent 
persuasion. Refutational defenses involve 
mention and refutation of possible counter- 
arguments against the belief, while ignoring 
arguments positively supporting the belief. 
Supportive defenses do mention and elabo- 
rate arguments positively supporting the be- 
lief, while ignoring possible counterarguments 
against it. This superiority of the refutational 
defense would follow from the above theo-
retical assumptions, since the refutational de-
fense contains a threatening element (men-
tion of the counterarguments to whose exist-
ence the subject has probably given little, if
any, thought) which stimulates him to bolster
his belief. The supportive defense of the tru- 
ism, on the other hand, seems to labor the 
obvious, so that the subject is little motivated 
to assimilate the positive arguments and is 
left, if anything, even less motivated to bolster 
further the belief he regards as obvious. 

It was also demonstrated (Papageorgis & 
McGuire, 1961) that the refutational defense 
confers resistance to subsequent attacks even 
by novel counterarguments, different from 
those explicitly refuted in the defense. This 
conferral of generalized immunity by the 
refutational defense also follows from the 
theoretical assumptions. The immunizing effi- 
cacy derives not only from weakening the 
credibility of the specific counterarguments 
refuted, but also from the threat-induced
stimulation to bolster one’s defense. Hence,
the refutational defense increases resistance 
even to attacks by counterarguments other 
than those refuted. 

The foregoing theoretical interpretation of 
the previous findings gives rise to three pre- 
dictions regarding the temporal persistence of 
the resistance conferred by the different types 
of prior defense. First, it is hypothesized that 
the supportive defense will not only be ini-
tially inferior to the refutational in the
amount of resistance it confers but, in addi-
tion, that such resistance as it does confer
will decay more rapidly than that conferred
by the refutational defense. This prediction 
follows from the above interpretation that 
the immunizing efficacy of the supportive de- 
fense derives solely from the acquaintance 
with the positive arguments which it contains 
and which tend to be forgotten over time; 
the efficacy of the refutational defenses, on 
the other hand, derives in part from the 
threat-induced motivation to bolster one’s
defenses. Since for some time the subject will 
continue to act on the motivation, the for- 
getting of the refutational material will be 
partly offset by this continued acquisition of 
bolstering material. 

The second hypothesis is that the temporal 
decay of conferred immunity occurs more 
rapidly against attacks by the same counter- 
arguments as had been explicitly refuted than 
against attacks by novel counterarguments. 
The theoretical basis for this prediction is 
quite similar to that yielding the first hy- 
pothesis. The immunity to attacks by the 
very counterarguments refuted derives from 
both recall of the specific refutations, which 
decays over time, and the amount of bolster- 
ing material the subject has acquired on the 
basis of his induced motivation, which in- 
creases over time. The immunity to attacks 
by novel counterarguments derives solely from 
the latter mechanism. Hence, conferred re- 
sistance to novel counterarguments should 
tend to catch up over time with resistance to 
the very counterarguments refuted. 

The third hypothesis, which follows as a 
corollary of the second, is that the refuta- 
tional defense has a delayed-action effect in 
conferring resistance to attacks by novel 
counterarguments. The refutations per se
should not confer any resistance in this case
(at least, not in so far as the counterargu-
ments used in the attack are indeed novel).
Hence, any resistance conferred derives from
the second mechanism, the motivation to
bolster one’s belief induced by exposure to
the threatening counterarguments during the
defense. Acting on this motivation requires
time, particularly in the monolithic ideologi-
cal environment that tends to surround cul-
tural truisms. Hence, the resistance to attack
by novel arguments will continue to grow for
some time after the threatening pre-exposure.
As time passes, the induced motivation will,
of course, itself decay so that the total time
function will be nonmonotonic. But for a
time at least the conferred immunity will
grow.
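
The two mechanisms invoked in the second and third hypotheses can be made concrete with a toy model. The sketch below is purely illustrative: the functional forms (exponential forgetting, a rising-then-fading bolstering term) and every parameter value are assumptions introduced here, not quantities taken from the paper or estimated from its data. It only shows that such a pair of mechanisms yields a shrinking gap between the "same" and "novel" conditions and a nonmonotonic curve for the "novel" condition.

```python
import math

# All constants below are hypothetical and chosen only for illustration.
RECALL_START = 2.0      # initial resistance from remembering the specific refutations
RECALL_HALFLIFE = 2.0   # days for that recall to fall by half
BOLSTER_CEILING = 3.0   # upper bound on resistance from self-acquired bolstering material
BOLSTER_RISE = 1.5      # days governing how fast bolstering material accumulates
MOTIVE_DECAY = 6.0      # days governing how fast the induced motivation fades


def recall(t):
    """Resistance due to recall of the refutations; decays with time t (days)."""
    return RECALL_START * 0.5 ** (t / RECALL_HALFLIFE)


def bolstering(t):
    """Resistance due to threat-induced bolstering; rises at first, then fades."""
    return BOLSTER_CEILING * (1 - math.exp(-t / BOLSTER_RISE)) * math.exp(-t / MOTIVE_DECAY)


def resistance_same(t):
    """Attack uses the very counterarguments refuted: both mechanisms contribute."""
    return recall(t) + bolstering(t)


def resistance_novel(t):
    """Attack uses novel counterarguments: only the bolstering mechanism contributes."""
    return bolstering(t)


if __name__ == "__main__":
    for day in (0, 2, 7):
        print(day, round(resistance_same(day), 2), round(resistance_novel(day), 2))
```

Under these assumed forms the difference between the two conditions equals the recall term alone, so it can only shrink with time, while the "novel" curve starts at zero, rises, and later falls.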

To test these temporal-trend predictions
adequately it is important that we have some
idea of the time parameters involved. It was
in part to explore these parameters that the
time interval between defense and attack was
deliberately varied, from experiment to ex-
periment, in the previous studies in this se-
ries. For example, in McGuire (1961a), the
attack came immediately after the defense; in
McGuire and Papageorgis (1961), 2 days in-
tervened; and in Papageorgis and McGuire
(1961), the interval was 1 week. Hence, it is
possible to make a crude test of the three
temporal-trend hypotheses by cross-experi-
mental comparisons. The results based on
such cross-experimental comparisons are de-
picted in Figure 1a and can be seen to be
in accord with each of the three hypotheses.
This confirmation of the predictions cannot
be regarded as definitive since extraneous
conditions varied somewhat from experiment
to experiment. For example, the issues, de-
fensive and attacking messages, and types of
subjects all differed somewhat among the ex-
periments. The confirmations are sufficiently
clear, however, that we were encouraged to
vary systematically the time intervals within
the present experimental design over com-
parable magnitudes.


METHOD 


Procedure. The study was represented to the subjects
as an investigation of personality correlates of
verbal skills, a deception that was bolstered by
several tasks the subjects were called upon to perform
during the experimental sessions. Each of the 160
subjects took part in two experimental sessions.
During the first they received 600-word mimeographed
messages defending their initial beliefs on medical
truisms such as “Everyone should visit his doctor
at least once a year for a routine physical check up.”
The subject was told that he would be scored on
his ability to analyze such technical passages and he
was given 4 minutes to read and, in each paragraph,
underline the shortest clause that summarized the
main point being made in the paragraph. This
underlining task was introduced to encourage careful
reading and to disguise the persuasive purpose of
the messages. He was then given various personality
tests, not relevant to the hypotheses under discussion,
to disguise further the persuasive intent of the
study.

The second session came either 2 days (for 80 subjects)
or 7 days (for the other 80) later. In the
second session, each subject received further defensive
messages on additional medical truisms and
then, within the same booklet, additional messages
attacking the previously defended truisms and, in
control conditions described under Design, previously
undefended truisms as well. As in the first session,
the subject was given 4 minutes to read and underline
the crucial clauses in each of these defensive
and attacking messages. Another personality
questionnaire was then administered and then the
subject was asked to fill out an opinion questionnaire
indicating his own beliefs on the medical issues dealt
with in the messages, on the pretext that we wished
to check on whether the subject’s personal opinions
on the topics discussed in the passages affected his
ability to read these passages analytically. The
subjects then filled out a questionnaire designed to
ascertain the extent to which the desired experimental
conditions obtained,2 after which the true
nature of the experiment and the nature of and
reasons for the deceptions were explained to the subject.

2 This Critique of the Experiment final questionnaire
was designed to measure the adequacy of the
time allowances, how much the subject had heard
of the experiment in advance and whether he
suspected its persuasive intent. About 20% of the
subjects complained that some section of the test had
been given either a too long or a too short time
allowance; more than half the complaints were that
the time allowed for the noncrucial personality test
was too short. The subjects were indeed rushed
through this section to keep the session down to 50
minutes. A surprising number admitted having heard
something about the experiment in advance, despite
our request to the subjects that they refrain from
discussing it with anyone until the end of the
experimental period. Hearing that the test involved a
reading comprehension test or that it dealt with
medical topics was admitted by 31 out of the 160
subjects. In addition, 4 heard that one’s opinions
were measured. When called upon to suggest what
else, besides verbal skills, the experiment could have
measured, only 5 suggested any purpose having to
do with opinion change, persuasion, or propaganda.

Defensive and attacking messages. Two types of
defensive messages were employed. The “supportive”
defense had an introductory paragraph mentioning
that the truism in question was obviously valid but
that it was wise to consider some of the reasons why
it was indeed valid. Two arguments in support of
the belief were then mentioned. There followed two
paragraphs each developing in a calm, factual way
one of the two supportive arguments. These
supportive messages avoided any mention of possible
counterarguments against the truism.

The “refutational” defenses began with a similar
introductory paragraph mentioning that the truism
was obviously valid but that, since occasionally one
heard misguided counterarguments attacking it, it
was wise to consider some of these counterarguments
and show wherein they erred. Two counterarguments
against the truism were then mentioned.
The following two paragraphs each refuted in a
calm, factual way one of these counterarguments.
These refutational messages avoided mention of
arguments directly supporting the truism; they merely
refuted counterarguments against it.

The attacking messages were similar in format to
the defensive, each being about 600 words in length
and divided into three paragraphs. The introductory
paragraph stated that most laymen would be
surprised to learn that advanced medical and scientific
work was beginning to cast some doubt on the
validity of the truism in question and, hence, it would
be wise to ponder some of these recently discovered
counterarguments against the belief, two of which
were then mentioned. Each of the following two
paragraphs expounded in a calm, factual manner the
validity of one of these counterarguments. When the
attacking message followed a refutation defense, one
of two alternatives was used. Half of the subjects
received attacks employing the very counterarguments
refuted; the other half of the subjects received
attacks employing novel counterarguments,
different from those previously refuted.

Since the experimental design called for each
subject’s serving in four different defensive conditions,
it was necessary to prepare supportive defense,
refutational defense, and attacking messages on four
different truisms. Furthermore, since half the
refutational defenses had to be followed by attacks
employing the same counterarguments as refuted and
half by novel counterarguments, it was necessary to
prepare two alternative forms of the refutational
defense message and of the attacking messages on each
issue, each dealing with a different pair of
counterarguments. For symmetry of design, we prepared
alternative forms of the supportive defense on each
issue, each form using a different pair of supportive
arguments. Hence, 24 messages in all were employed
in the present study, 6 on each of four issues.3

3 All 24 of the defensive and attacking messages
used in this study have been deposited with the
American Documentation Institute. Order Document
Opinion questionnaire. Beliefs on the four issues
were measured by a 17-item opinion questionnaire. 
Each item consisted of an assertion on one of the 
issues (e.g., “Everyone should brush his teeth after 
every meal if at all possible.”) followed by a graphic 
scale containing 15 numbered categories with Defi- 
nitely False at one end and Definitely True at the 
other. The subject was told to make an “X” in 
whichever of the categories best indicated his own 
agreement with the statement. There were four items 
on each issue. The scores cited in the Results sec-
tion and in Table 1 are based on the mean of the
four items on each issue, with the possible range 
going from 1.00, for complete rejection of the tru- 
ism, to 15.00 for complete agreement therewith. One 
of the 17 items was a repeat to serve as a reliability 
check. The two responses to this repeated item 
yielded an intrasubject correlation of .82.
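
The scoring rule just described (four items per issue, each marked on a 15-category scale, averaged into one issue score) can be sketched as follows. This is a minimal illustration and not the author's scoring procedure: the function names are hypothetical, and the Pearson formula is assumed here as the statistic behind the reported correlation on the repeated item, which the text does not specify.

```python
from statistics import mean

SCALE_MIN, SCALE_MAX = 1, 15   # Definitely False ... Definitely True
ITEMS_PER_ISSUE = 4


def issue_score(item_responses):
    """Mean of the four item responses on one issue (possible range 1.00 to 15.00)."""
    if len(item_responses) != ITEMS_PER_ISSUE:
        raise ValueError("expected four items per issue")
    for response in item_responses:
        if not SCALE_MIN <= response <= SCALE_MAX:
            raise ValueError("response outside the 15-category scale")
    return mean(item_responses)


def pearson_r(first, second):
    """Pearson correlation between paired responses (assumed reliability statistic)."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long lists with at least two pairs")
    mean_first, mean_second = mean(first), mean(second)
    covariance = sum((a - mean_first) * (b - mean_second) for a, b in zip(first, second))
    spread_first = sum((a - mean_first) ** 2 for a in first) ** 0.5
    spread_second = sum((b - mean_second) ** 2 for b in second) ** 0.5
    return covariance / (spread_first * spread_second)
```

With 16 scored items (four per issue) plus the one repeated item, the questionnaire length of 17 is accounted for.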

Experimental design. The design included four 
blocks of subjects. The subjects in the first block 
received refutational defenses on all four issues. 
They received such defenses on two issues in a first 
session 2 days previous to the attack; and the de- 
fenses on the other two issues at the second session 
immediately before the attack. In each session the 
defense on one issue involved refutations of the same 
counterarguments as would be used in the attack, 
and on the other issue, the refutation of alternative 
counterarguments to those that would be used in 
the attack. The subjects in the second block re- 
ceived the same treatments as those in the first 
block, except that for them the first session preceded 
the attacks by 1 week rather than 2 days. 

The subjects in the third block received defenses 
on only two issues. Both of these were supportive 
defenses, one coming in the first session 2 days be- 
fore the attack and one in the second session just 
before the attack. As regards the other two unde- 
fended issues, one was attacked in the second session 
to ascertain the impact of the attacks in the absence 
of any prior defense and one was not attacked to 
obtain an estimate of the initial belief levels in the 
absence of both defense and attack. The subjects in 
the fourth block received the same treatments as 
those in the third, except that for them the first ses- 
sion preceded the attacks by 1 week rather than 2 
days. 

Since there were four issues and two alternative
sets of materials on each issue, eight subconditions
were necessary in each block to allow the materials
to be systematically rotated around the four
treatment conditions. Five subjects served in each of these
eight “materials” subconditions, so that 40 subjects
served in each of the four blocks.5

No. 7058 from ADI Auxiliary Publications Project,
Photoduplication Service, Library of Congress;
Washington 25, D. C., remitting in advance $2.25 for
microfilm or $5.00 for photocopies. Make checks
payable to: Chief, Photoduplication Service,
Library of Congress.

4 This opinion questionnaire has been deposited
with ADI and can be obtained by writing for the
document mentioned in Footnote 3.

The purpose of this rather complex design was to 
allow more sensitive tests of the theoretically rele- 
vant and likely-to-be-small treatment effects. Thus, 
all comparisons between refutational defenses, for 
example, those predicted in the second and third 
hypotheses, involve intrasubject analyses. Likewise, 
comparisons between the supportive defense (which 
usually has the smallest effect—see for example, Mc- 
Guire & Papageorgis, 1961) and no defense condi- 
tions also involve sensitive intrasubject analyses. It 
is true that comparisons between the refutation de- 
fenses on the one hand and the supportive defense, 
or the no defense control conditions involve across- 
subject comparisons, but such variations have been 
demonstrated in previous studies to produce sizable 
differentials. 
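
The counts implied by the design can be tallied mechanically. The sketch below is a bookkeeping aid under stated assumptions: the block layout follows the description under Experimental design, the eight "materials" subconditions are read here as four issues times two alternative sets of materials, and all identifiers are hypothetical names introduced for illustration.

```python
ISSUES = 4
FORMS_PER_ISSUE = 2
MESSAGE_KINDS = ("supportive defense", "refutational defense", "attack")
TOTAL_MESSAGES = ISSUES * FORMS_PER_ISSUE * len(MESSAGE_KINDS)

# Block number -> (type of defense, days between first session and attack)
BLOCKS = {
    1: ("refutational", 2),
    2: ("refutational", 7),
    3: ("supportive", 2),
    4: ("supportive", 7),
}
SUBCONDITIONS_PER_BLOCK = ISSUES * FORMS_PER_ISSUE   # assumed reading of "eight"
SUBJECTS_PER_SUBCONDITION = 5
SUBJECTS_PER_BLOCK = SUBCONDITIONS_PER_BLOCK * SUBJECTS_PER_SUBCONDITION
TOTAL_SUBJECTS = SUBJECTS_PER_BLOCK * len(BLOCKS)


def scores_per_cell():
    """Count the belief scores contributed to each (condition, delay in days) cell."""
    counts = {}
    for defense, delay in BLOCKS.values():
        if defense == "refutational":
            conditions = ("refutational-same", "refutational-different")
        else:
            conditions = ("supportive",)
            # Supportive blocks also supply the two control issues.
            for control in ("attack only", "neither attack nor defense"):
                counts[(control, None)] = counts.get((control, None), 0) + SUBJECTS_PER_BLOCK
        for condition in conditions:
            # Each subject is defended once in the first session (delayed attack)
            # and once in the second session (immediate attack).
            for days in (0, delay):
                key = (condition, days)
                counts[key] = counts.get(key, 0) + SUBJECTS_PER_BLOCK
    return counts
```

Because both blocks of a given defense type contribute to the immediate-attack cells while only one contributes to each delayed cell, the immediate cells rest on twice as many scores as the 2-day and 7-day cells.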

The complexity of the design did necessitate the 
computation of several different error terms to evalu- 
ate the differential effects presented in the Results 
section below. In general, the error terms are based 
on the residual variance in the conditions being com-
pared. The individual differences variance was re-
moved when the comparison involved repeated meas-
ures on the same subject, for example, the effects of
refutation of same vs. alternative counterarguments; 
or an interaction effect between type of defense and 
time of attack; or the effect of a supportive defense 
vs. no defense. When the comparison was between a 
refutational defense and a supportive or a no defense 
condition, the large between-subject residual vari-
ance, including individual differences among the sub-
jects, was used as the error term.

Subjects. All 160 subjects were selected from a 
pool of students enrolled in the introductory psy- 
chology course at the University of Illinois. Among 
those who indicated at the beginning of the semester 
that they were regularly available on the days and 
hours chosen for running the experiment, the selec- 
tion was random. About 75% of the 220 subjects 
requested to appear actually participated in the ex- 
periment and the data reported below are based on 
the first 160 of those who appeared for both ses-
sions. The majority were sophomores and about 55%
were females. 


RESULTS AND DISCUSSION 


General effects. The two control conditions 
set the probable limits within which the dif- 
ferential immunization effects can take place. 
The mean belief level in the neither-attack- 
nor-defense control condition is 11.74 on the 
15-point scale, and can be taken as an esti- 
mate of the initial belief level on the four 
truisms. Actually the estimate is probably 
conservative, since it is based on the indicated
belief on an issue unmentioned in the
messages, but taken after the receipt of messages
strongly attacking three other truisms
and, hence, may reflect a general wariness on
the part of the subject (see McGuire & Papageorgis,
1961, and Papageorgis & McGuire,
1961, for data on the accuracy with which
such postcommunication beliefs on control,
unmentioned issues estimate the initial level
of the beliefs). The mean belief level in the
other control condition (attack-only) is 8.49,
indicating that in the absence of any defense,
the attacks were effective in reducing the beliefs
3.25 points on the 15-point scale (p < .001).
The overall belief level in all three
defense-and-attack conditions at all three
time intervals is 10.17, which is almost exactly
midway between the neither-attack-nor-defense
and the attack-only means and
significantly (p < .01) different from either.
Furthermore, the means in all nine
defense-and-attack conditions (see Table 1) do lie
between the two means of the
neither-attack-nor-defense and attack-only conditions.


TABLE 1

PERSISTENCE OF THE RESISTANCE TO PERSUASION CONFERRED BY
THREE TYPES OF PRIOR BELIEF DEFENSE

                Type of defense which preceded the attack

                Supportive    Refutation of counter-    Refutation of
                arguments     arguments used in         alternative
                              the attacks               counterarguments

Immediate       9.71 (80)a    11.36 (80)                10.41 (80)
Two days        8.51 (40)     11.08 (40)                11.45 (40)
Seven days      8.82 (40)      9.49 (40)                 9.68 (40)

Neither attack nor defense: 11.74
Attack without prior defense: 8.49

Note. Scores in the cells are final belief levels on the truisms as
measured on a 15-point scale.
a Numbers in parentheses indicate the number of individual scores on
which the cell mean is based.


5 The design of the present study is described in
detail in Table A of the ADI document mentioned
in Footnote 3.

Relative persistence after supportive and
refutational defenses. The supportive defenses
conferred less resistance to the attack than
did the refutational defenses regardless of
whether the attack came immediately, 2 days,
or 1 week after the defense. When the attack
followed immediately, the superiority of the
combined refutational defense conditions to
the supportive was significant at the .01 level,
but this superiority was primarily due to the
conditions in which the very counterarguments
used in the attack were refuted: the
superiority over the supportive reached only
the .20 level of significance when
counterarguments alternative to those used in the
immediate attack were refuted.

Where 2 days intervened between the at- 
tack and defense, this superiority of the com- 
bined refutational to the supportive defense 
became even more pronounced (p < .001).
Whereas the immediate resistance conferred 
by the supportive defense had decayed (p 
< .05) almost completely after the 2-day in- 
terval, that conferred by the refutational de- 
fense actually showed a trivial gain from the 
immediate to the 2-day interval. As can be 
seen in Table 2, this gain yielded an F of 
only 1.05. This interaction effect between the 
supportive vs. refutation type-of-defense vari- 
able and the immediate vs. 2-day interval 
variable is significant at the .01 level and 
confirms the first hypothesis. It will be noted 
that this interaction effect is in the opposite 
direction to that to be expected on the basis 
of a simple regression effect: the resistance 
conferred by the supportive defense is not 
only less to immediate attacks, but such as 
it is, it also decays more rapidly than the 
greater immediate resistance conferred by the 
refutation. 
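
The summary figures quoted under General effects and the interaction just described can be checked against the cell means. The sketch below uses the means and numbers of scores as read from Table 1 and the two control means given in the text; the assignment of values to cells is this edition's reading of the table, and the variable and function names are hypothetical. It reproduces descriptive arithmetic only and does not attempt the significance tests.

```python
NO_ATTACK_NO_DEFENSE = 11.74
ATTACK_ONLY = 8.49

# (defense, days between defense and attack) -> (mean belief, number of scores)
CELLS = {
    ("supportive", 0): (9.71, 80),
    ("supportive", 2): (8.51, 40),
    ("supportive", 7): (8.82, 40),
    ("refute_same", 0): (11.36, 80),
    ("refute_same", 2): (11.08, 40),
    ("refute_same", 7): (9.49, 40),
    ("refute_different", 0): (10.41, 80),
    ("refute_different", 2): (11.45, 40),
    ("refute_different", 7): (9.68, 40),
}


def weighted_mean(cells):
    """Mean over cells, weighting each cell mean by its number of scores."""
    total = sum(cell_mean * n for cell_mean, n in cells)
    count = sum(n for _, n in cells)
    return total / count


def change(defenses, start, end):
    """Average change in mean belief from one interval to another."""
    diffs = [CELLS[(d, end)][0] - CELLS[(d, start)][0] for d in defenses]
    return sum(diffs) / len(diffs)


attack_effect = NO_ATTACK_NO_DEFENSE - ATTACK_ONLY
midpoint = (NO_ATTACK_NO_DEFENSE + ATTACK_ONLY) / 2
overall = weighted_mean(CELLS.values())

supportive_change = change(["supportive"], 0, 2)
refutational_change = change(["refute_same", "refute_different"], 0, 2)
interaction = refutational_change - supportive_change
```

Weighting each cell by its number of scores gives an overall mean that rounds to the 10.17 reported in the text, close to the midpoint of the two control means. Over the first 2 days the supportive mean falls to within a few hundredths of the attack-only level, while the combined refutational mean rises slightly, which is the direction of the interaction reported above.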

There is an alternative theoretical inter- 
pretation of this superior persistence of the 
immunity conferred by the refutational de- 
fense, an explanation related to the “sleeper 
effect” described by Hovland and his col- 
leagues (Hovland, Lumsdaine, & Sheffield, 
1949, Ch. 7; Hovland & Weiss, 1951; Kel- 
man & Hovland, 1953). According to these 
theorists, if a persuasive message is initially
accompanied by a discounting cue, its opin- 
ion change impact might actually increase 
with time passage, or at least decline rela- 
tively slowly (Weiss, 1953) as compared with 
a message not so accompanied. The recall of 
the persuasive content does, of course, decay 
over time but so does the recall of the dis-
counting cue, thus reducing or even revers-
ing the net decay of induced opinion change.
In the present situation the refutational de- 
fense could be interpreted as containing a 
discounting cue, namely, mention of the re- 
futed counterarguments, the forgetting of 
which dampens the decay of the initially in- 
duced opinion change. The supportive defense 
contains no such incidental discounting cue, 
so that its greatest impact should be felt im- 
mediately and decay thereafter without any 
mitigating effect of a simultaneously decay- 
ing discounting factor. Some credence is given 
to this interpretation by the results of an 
earlier study (McGuire & Papageorgis, 1961)
indicating that the direct strengthening effect
prior to any attack was somewhat greater
(.05 < p < .10) with the supportive than the
refutational defense, even though the latter 
conferred more resistance (p < .01) to an at- 
tack 2 days later. 

Relative resistance after refutation of same
and of alternate counterarguments. As can be
seen in Figure 1b, the resistance conferred by
the refutational-same defense, i.e., the defense
involving prior refutation of the very
counterarguments to be used in the attack,
declines monotonically as the interval between
defense and attack increases. The decline
from the immediate to the 2-day interval
is not significant but that from 2 days to
1 week is significant on the .01 level. A quite
different, nonmonotonic time trend can be
seen with the refutational-different defense,
i.e., the defense involving prior refutation of
counterarguments different from the ones that
are actually to be used in the attack on the
given belief. Although this type of defense
was inferior (p < .05) to the refutation-same
defense in conferring resistance to the
immediate attack, it has become trivially
superior when the attacks do not come until 2
days later (see Table 2). The interaction effect
between this same vs. different
refutational-defense variable and the immediate vs.
2-day interval variable appears on all four
issues individually, and the effect combined
over issues is significant above the .05 level.
Hence, the second hypothesis (that the decay
of the resistance conferred against attacks by
different counterarguments will be slower than
that to attacks by the same counterarguments
as had been refuted) is confirmed. As can be
seen in Figure 1b, there is actually greater
resistance in the “different” refutation condition
than in the “same” at both the 2- and
7-day intervals, which is embarrassingly more
than the theory demands but this differential
is trivial in magnitude.

The results also corroborate the third
hypothesis, regarding a delayed-action effect in
the resistance conferred by refutation of
counterarguments different from those used in the
attack. As can be seen in Table 2, the resistance
conferred by this type of defense is
greater against an attack 2 days later than
against an immediate attack on all four issues
individually, as well as in the combined
results (p < .05).


TABLE 2

MEAN BELIEF SCORES (on a 15-point scale) AND ANALYSIS OF VARIANCE
IN THE CONDITIONS INVOLVING ATTACKS BY THE SAME OR BY NOVEL
COUNTERARGUMENTS IMMEDIATELY AFTER AND TWO DAYS AFTER
THE REFUTATION DEFENSES, WITH ISSUE-BY-ISSUE SUBMEANS

                   Refutational-same defense    Refutational-different defense

                   Immediate    Attack after    Immediate    Attack after
                   attack       2 days          attack       2 days

Chest X ray        10.10         8.50            8.96        [not legible]
Penicillin         12.59        12.95           11.92        [not legible]
Toothbrushing      10.35         9.92            9.84        [not legible]
Annual physical    12.41        12.92           10.92        [not legible]

All issues         11.36        11.08           10.41        [not legible]

[An “All treatments” column is not legible in this copy.]

Source
Type defense (refutational-same vs. refutational-different)
Time (immediate attack vs. 2 days)
Type X Time
Issues
Issues X Treatments
Subject
Residual
Total

[The numerical entries of the analysis of variance are not legible in this copy.]


FIG. 1. Persistence of the resistance to persuasion
conferred by three types of prior defense: supportive,
refutation of the same counterarguments as used in the
attack, and refutation of counterarguments different
from those used in the attack. (Horizontal axis: TIME
(IN DAYS) BETWEEN DEFENCE AND ATTACK.)

1a. Based on data from previous experiments.
(Zero interval points are based on data from McGuire,
1961a; 2-day interval points, on McGuire
and Papageorgis, 1961, and McGuire, 1961b; and
7-day points, on Papageorgis and McGuire, 1961.)

1b. Based on data from the present experiment
as shown in Table 1.
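
The time trends claimed for the two refutational conditions can be restated numerically. The sketch below uses the three cell means per condition as read from Table 1 (this edition's reading of a damaged table); the helper names are hypothetical, and only the descriptive pattern is checked, not its statistical significance.

```python
INTERVALS = (0, 2, 7)   # days between defense and attack

# Mean belief after attack, by interval.
SAME = {0: 11.36, 2: 11.08, 7: 9.49}        # refutation of the counterarguments later used
DIFFERENT = {0: 10.41, 2: 11.45, 7: 9.68}   # refutation of alternative counterarguments


def is_monotone_decreasing(curve):
    """True if the mean falls at every successive interval."""
    values = [curve[day] for day in INTERVALS]
    return all(earlier > later for earlier, later in zip(values, values[1:]))


def peak_day(curve):
    """Interval at which the mean belief is highest."""
    return max(INTERVALS, key=lambda day: curve[day])


def change(curve, start, end):
    """Change in mean belief between two intervals."""
    return curve[end] - curve[start]


# Second hypothesis: over the first 2 days the "different" condition loses
# less (here it gains) relative to the "same" condition.
interaction = change(DIFFERENT, 0, 2) - change(SAME, 0, 2)
```

On these means the "same" curve falls at each step, whereas the "different" curve peaks at the 2-day interval, which is the delayed-action pattern of the third hypothesis.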

In general, the results from this experiment
agree closely with the cross-experimental com-
parisons from the previous studies in the in-
troductory section. As can be seen from a
comparison of Figures 1a and 1b, the two sets
of curves are quite similar in shape and even
as regards absolute parameters, except that
in the present experiment, the refutational-
different defense tends to be somewhat more
effective than in the previous experiments.
The general implication of the study, particu-
larly when considered in the context of the
previous studies in this series, is to corrobo-
rate further the initial postulate: that the
supportive defense confers resistance to per-
suasion only in so far as the material pre-
sented is assimilated and retained, an activ-
ity that the subject is little motivated to carry
out in the case of a “truism.” The resistance 
conferred by the refutational defense, on the 
other hand, derives not only from the assimi- 
lation and retention of the bolstering material 
actually presented but also from the motiva- 
tional effect of the pre-exposure to threaten- 
ing material, the mention of the counterargu- 
ments, contained in the refutational defense. 
If this interpretation is correct, then the tem- 
poral differentials found among the defenses 
in this study should be reduced as we move 
from the truisms used in this study, with respect to
which the defense-stimulating threat is particularly
necessary and possible, to beliefs on more
controversial issues. The results do seem fairly
general as regards truisms, as can be seen in the
trivial magnitude of the Issues × Treatments
interaction effects (see Table 2).


SUMMARY 


Theoretical considerations like those which 
led to the predictions tested in the previous 
studies of this series on immunizing beliefs 
against persuasion yielded several hypotheses 


regarding differential persistence of the im- 
munity conferred by various types of prior 
belief-defenses. First, it was predicted that 
the immunity conferred by refutational de- 
fenses would decay less rapidly than that con- 
ferred by the supportive defenses. Secondly, 
within the refutational-defense conditions, it 
was predicted that the conferred resistance 
to attacks by counterarguments other than 
the explicitly refuted ones would decay less 
rapidly than resistance to attacks by the very 
counterarguments refuted. A related third pre- 
diction was that there would be a delayed ac- 
tion effect in the immunity to attacks by novel 
counterarguments conferred by the refuta- 
tional defense. 




Each of the 160 college student subjects
served in two experimental sessions. The first 
involved reading defensive articles on medi- 
cal truisms. The defenses involved either 
arguments supporting the truism, or refuta- 
tions of counterarguments against the truism, 
either the very counterarguments to be used 
in the later attack or alternative counterargu- 
ments. The second session came either 2 days 
(for 80 subjects) or 7 days (for the other 80) 
after the first, and involved a second defensive 
treatment on another truism and then at- 
tacks on the previously defended and unde- 
fended truisms. The subjects’ beliefs on all 
the truisms were then measured. All three 
hypotheses received substantial confirmation.


This study had two principal objectives: to
compare the acquisition of stimulus-response 
associations involving neutral and aversive re- 
sponse members; to study the acquisition of 
such associations as a function of two methods 
of paired-associate learning. The two paired- 
associate paradigms included the conventional 
paired-associate task (PAC) in which the 
subject is presented with a stimulus word and 
has to anticipate the correct response term, 
and a modified paired-associate paradigm 
(PAD) described by Ramond (1953) which 
necessitates a discrimination by the subject. 
The PAD technique is described in greater 
detail later. 

Two recent studies (Jacobs, 1955; Laffal, 
1952) suggest that it is more difficult for col- 
lege subjects to establish associations involv- 
ing disturbing or aversive material than those 
involving neutral material. Laffal, for example,
reported a difference significant at between the .05
and .01 levels between the learning of disturbing and
neutral items. The absolute mean difference in trials
to learn was relatively small (1.19). It may be noted
that in these studies, the so-called emotional or
disturbing words were specially selected on the basis
of such criteria as reaction time in a prelearning
word association task, GSR, or type of association.
That is, the critical words were idiosyncratic and
differed for each subject. Although in both studies
the critical words served as response members in a
conventional paired-associate task, the stimulus
members of the items differed. Laffal employed picture
stimuli; Jacobs used nonsense syllables. In the present
study both stimulus and response words were the usual
two-syllable adjectives selected from the commonly
used calibrated lists including the Haagen (1949) and
Melton and Safir (Hilgard, 1951) materials.

1 A portion of this paper was presented at the 1959
Annual Meetings of the Eastern Psychological
Association.

The present study extended the range of 
subjects to include not only college subjects 
but also Veterans Administration hospitalized 
schizophrenics and Veterans Administration 
nonpsychotic patients. It should be pointed 
out that interest centered not on a compari- 
son of levels of performance as between the 
college and hospital subjects, but on the intra- 
group learning patterns relative to the classes 
of words employed. 

The inclusion of hospital patients was sug- 
gested by the following considerations. It may 
be noted that even with the utilization of 
specially selected materials for each subject, 
the absolute difference in the learning of 
emotionally toned and neutral associations by 
college subjects has been relatively small. In 
recent studies involving schizophrenics, not 
necessarily learning studies, the findings have 
been interpreted as suggesting that schizo- 
phrenics are peculiarly sensitive to aversive 
content (Dunn, 1954) and that they tend to 
avoid aversive personal associations (White, 
1949). This led to the expectation that schizo- 
phrenics particularly should have difficulty in 
acquiring associations involving aversive or 
affectively toned words. The decision to study 
the learning method variable was stimulated 
by the rather unusual findings (described 
later) obtained with PAD, the learning
method initially employed.


TABLE 1

LEARNING MATERIAL WITH CORRESPONDING AVERSIVE-NEUTRAL
AND NEUTRAL-NEUTRAL PAIRS

Neutral-Aversive pairs (A-Na): Stimuli, Response
Neutral-Neutral pairs (N-N): Stimuli, Response

HOPEFUL, COVERT
FILTHY, KINGLY
EVEN, LEVEL
EQUAL, ALIKE

PLEASANT, CHEERFUL
HUMBLE, SINCERE
GUILTY, DISTINCT
UPPER, HIGHER

CONSTANT, STEADY
ALIVE, FAITHFUL
EVIL, SPECIAL
THOROUGH, COMPLETE

FROSTY, PROFOUND
TOTAL, ENTIRE
READY HATEFUL
PREPARED NOVEL


METHOD
Subjects



METHOD 
Subjects 


One hundred and twenty subjects,² divided into six
groups of 20, participated in the study. Three groups
(60 subjects) were run with PAD and corresponding
groups were run with PAC. The groups employed
with PAD had the following characteristics:

1. Schizophrenic subjects were male patients in a
Veterans Administration hospital diagnosed as the
paranoid type in partial remission. Their chronological
age range was from 30 to 46 with a mean of 37.7 years.

2. A Veterans Administration hospitalized nonpsychotic
group was selected from a Veterans Administration
general medical and surgical hospital. This group was
convalescing from a fairly wide range of physical
ailments, for example, hernia surgery, pneumonia,
hypertensive vascular disease. Mean age was 36.3 with
a range from 19 to 50 years.

3. College subjects were male students enrolled in
an evening graduate psychology course. They ranged
from 25 to 48 years with a mean age of 31.8.

The subjects run with the PAC technique were 
drawn from the same sources as the above subjects 
and had the following characteristics: The schizo- 
phrenic group age range was from 24 to 45 years 
with a mean of 35.8 years; the nonpsychotics ranged 
in age from 23 to 50 with a mean of 36.5 years; the 
college subjects ranged in age from 21 to 52 with a 
mean of 32.5 years. 

The hospital groups on the basis of their WAIS 
vocabulary scores ranged in intelligence from dull 
normal to superior. The mean intelligence of these 
groups would place them in the low average to av- 
erage range. No comparable data were available for 
the college subjects. 


2 Appreciation is expressed to the Managers and 
Staff of the Veterans Administration Hospitals, Leech 
Farm Road, and University Drive, Pittsburgh, for 
making space and subjects available. 


Learning Material 


PAD. The material consisted of eight item pairs
constituting a list totaling 16 items. A stimulus word
and two response alternatives composed an item. The
subjects’ task was to select the correct alternative.
The stimulus words for each item pair were synonyms
while the response alternatives were identical. For
each stimulus a different response alternative was
correct. In the following example of an item pair the
arrow points to the correct response.

[Example item pair: each of the two stimulus words is shown with the response alternatives kingly and filthy; the stimulus words and the arrows did not survive in this copy.]

In four of the item pairs the response alternatives
included an aversive and neutral word (A-Na pairing).
In the other four pairs both response alternatives
were considered neutral (N-N pairing). Stimulus words
were so selected that the association value between
stimuli of a pair of items as calibrated by Melton and
Safir (Hilgard, 1951) was equal for an A-Na and a
corresponding N-N pair. Similarly the response
alternatives were equated for frequency according to
the Thorndike-Lorge (1944) list. Corresponding A-Na
and N-N item pairs are listed opposite each other in
Table 1. In Table 1 the stimulus and response words
for the A-Na pairs are in the left half and the
stimulus and response words for the N-N pairs are in
the right half. The four aversive words used were
FILTHY, GUILTY, EVIL, and HATEFUL. The judgment of
aversiveness was made on an a priori basis by the
experimenters and was checked informally with other
staff psychologists.

PAC. The same learning list was used in the PAC
paradigm. An item now consisted of a stimulus word
and one response word which had to be anticipated.
Each stimulus word was paired with the same correct
response word as in PAD. The 16 stimulus-response
pairs in Table 1 now again constitute these
individual items.
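
As a rough illustration of the list structure just described, the sketch below builds one PAD item pair and derives the corresponding PAC items. The pairing READY with HATEFUL and PREPARED with NOVEL is read from the last rows of Table 1; the names `PadItem`, `make_item_pair`, and `to_pac` are hypothetical and are not part of the original study.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class PadItem:
    """One PAD item: a stimulus shown with two response alternatives."""
    stimulus: str
    alternatives: Tuple[str, str]
    correct: str


def make_item_pair(stimuli: Sequence[str], alternatives: Sequence[str]) -> List[PadItem]:
    """Build the two items of an item pair.

    Both items share the same response alternatives; the i-th stimulus
    takes the i-th alternative as its correct response, so each
    alternative is correct for exactly one of the two stimuli.
    """
    if len(stimuli) != 2 or len(alternatives) != 2:
        raise ValueError("an item pair needs two stimuli and two alternatives")
    shared = (alternatives[0], alternatives[1])
    return [PadItem(s, shared, shared[i]) for i, s in enumerate(stimuli)]


def to_pac(items: Sequence[PadItem]) -> Dict[str, str]:
    """PAC keeps each stimulus with its correct response only."""
    return {item.stimulus: item.correct for item in items}


# Assumed reading of Table 1: READY -> HATEFUL (aversive), PREPARED -> NOVEL.
pair = make_item_pair(("READY", "PREPARED"), ("HATEFUL", "NOVEL"))
pac_items = to_pac(pair)
```

Under this reading, the subject in PAD sees both alternatives on every presentation, whereas in PAC only the stimulus is shown and the single correct response must be anticipated.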


Apparatus and Procedure 


For both phases of the study the learning material
was presented on a Hull-type memory drum. There
were five different orders of the list. For PAD each
item presentation consisted of the simultaneous
appearance in the aperture of the drum of both the
stimulus and two response alternatives. Each item
was exposed for 4 seconds with an intertrial interval
of 8 seconds. In the PAC phase the stimulus word
appeared in the aperture for 4 seconds, then the
stimulus and response words appeared together for
4 seconds. The intertrial interval again was 8 seconds.

In the PAD paradigm this general procedure was
employed. After the subject was seated before the
apparatus a set of instructions was read aloud to
him. In general the subject was informed of the nature
of the experiment and then was told that he would see
a succession of items of one stimulus word on the left
and two response words on the right and that his task
was to learn in each instance which of the two words
on the right was correct. He was to indicate his
choice by calling out the response word he thought
was correct. If the subject called out the correct
choice the experimenter said “right” and if the
subject’s choice was incorrect the experimenter
remained silent. The subject was encouraged to guess
if he was not sure. The experimenter was seated behind
the table on which the memory drum was mounted in full
view of the subject. It was felt that an experimental
situation in which the experimenter was in view would
be less anxiety provoking for the schizophrenic group
than one in which he was hidden behind a screen. A
learning session included the following: the subject
was first given eight trials on a four-item practice
list of two-syllable nouns, a 1-minute rest period,
15 trials on the test list, a 2-minute rest period,
and 15 more trials on the test list. Each subject then
received 30 trials unless he reached a criterion of
three perfect repetitions prior to 30 trials.

For PAC the subjects were given the usual
instructions for paired-associate learning. Otherwise
the procedure was the same as with PAD.


RESULTS 
Performance curves for the three groups of subjects
on the different word classes for PAD and PAC are
shown in Figures 1 and 2, respectively. Percentage of
correct responses is plotted against five trial blocks.

Figure 1 indicates that the pattern of performance on
the various word classes for the two hospital groups
is remarkably similar and differs quite sharply from
that of the college subjects. The top curves for the
hospital groups depicting the courses of learning for
the Na words (neutral paired with aversive) begin high
at approximately 70% correct responses but remain
essentially flat showing little change over trial
blocks. The middle curves for the hospital subjects
are for the two subgroups of correct neutral words
combined (neutral paired with neutral). These begin at
approximately 50% correct as would be expected on the
basis of initial chance performance but also show
little evidence of improvement in performance. The
bottom curves are for the A words and show a very low
rate of initial correct responding, performance
approximating 20% correct. This is clearly below
chance and certainly far below the corresponding
points for the upper curves. An interesting feature of
this performance picture is that despite the initial
low level of responding and the overall inferiority of
performance on the A words as compared with the
others, there is some evidence of improvement with a
slow but steady rise to about the 40% level. In rather
marked contrast the curves for the college subjects do
not show the dramatic distinctness evidenced in those
for the hospital groups. They all begin at
approximately 50% and climb to roughly 80% for the
last block of trials. There is clear evidence of
learning by college subjects on all word sets. It may
be noted that the curve for the A words does tend to
fall somewhat below that of the Na words, but that it
also is not clearly discriminable from the N curve.

[Figure 2. Course of learning for each word class with Paired-Associate Conventional (PAC) for each group of subjects.]


TABLE 2

MEAN NUMBER OF CORRECT RESPONSES AND THEIR STANDARD ERRORS MADE BY EACH
GROUP ON THE DIFFERENT WORD SETS FOR EACH LEARNING TECHNIQUE

                      Schizophrenics     Nonpsychotics      College
Word set              M        SE        M        SE        M        SE

Paired-Associate Discrimination (PAD)
Aversive              ...      ...       44.05    4.91      75.85    ...
Neutral-Aversive      ...      ...       84.10    4.00      82.25    ...
Neutral               64.41    ...       64.27    2.33      76.12    3.42
All Words             250.60   6.3...    256.70   8.03      310.35   12.65

Paired-Associate Conventional (PAC)
Aversive              25.85    4.85      35.15    4.42      51.65    5.8...
Neutral               32.68    4.87      36.19    3.08      60.02    4.5...
All Words             123.90   18.61     143.80   12.29     231.85   16.49


Figure 2 which shows corresponding data 
for PAC indicates a greater similarity of in- 
tralearning pattern among the three groups of 
subjects. Here it will be recalled there are 
only two curves, for A and N words, since 
there could be no neutral words paired with 
aversive ones in this paradigm. All groups 
now showed clear evidence of learning on all 
word classes. Interestingly both the schizo- 
phrenic and college groups tended to do worse 
on A than on N words, a difference not sug- 
gested for the hospital nonpsychotics. 

Table 2 presents the mean number of correct responses
and their SE made by each group on each word class for
the two learning paradigms. The data for PAD include
the means for three word classes based upon the four
aversive words (A), the four neutral words paired with
them (Na), and the eight neutral words paired with
each other (N). The PAC data include only means for
two word classes, one based upon the four aversive
words and the other upon 12 associations involving
neutral response words. The mean values in Table 2 for
the N words for the PAD data represent the mean of
one-half the number of N words each subject got
correct while the values for the N words in the PAC
data represent the mean of one-third the number of N
words each subject got correct. The bottom row under
each technique presents the mean number of correct
responses and the SE for all words combined, that is,
for the entire list of 16 items.
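
The scaling just described can be checked with simple arithmetic. The sketch below is offered only as an illustration: it rebuilds the "All Words" mean from the word-class means, counting the tabled N mean twice for PAD (it is one-half of the N words) and three times for PAC (it is one-third). The numbers are those legible in Table 2 and in the text; the PAD schizophrenic row is omitted because its word-class means are not legible in this copy, and the function names are hypothetical.

```python
# Means as printed in Table 2 (the PAD nonpsychotic total is taken from the text).
PAD_ROWS = {
    "nonpsychotics": {"A": 44.05, "Na": 84.10, "N": 64.27, "all": 256.70},
    "college": {"A": 75.85, "Na": 82.25, "N": 76.12, "all": 310.35},
}
PAC_ROWS = {
    "schizophrenics": {"A": 25.85, "N": 32.68, "all": 123.90},
    "nonpsychotics": {"A": 35.15, "N": 36.19, "all": 143.80},
    "college": {"A": 51.65, "N": 60.02, "all": 231.85},
}


def pad_total(row):
    # N is tabled as one-half of the 8 N-N words, so it counts twice.
    return row["A"] + row["Na"] + 2 * row["N"]


def pac_total(row):
    # N is tabled as one-third of the 12 neutral words, so it counts three times.
    return row["A"] + 3 * row["N"]


def residuals():
    """Printed 'All Words' mean minus the total rebuilt from word-class means."""
    out = {}
    for group, row in PAD_ROWS.items():
        out[("PAD", group)] = round(row["all"] - pad_total(row), 2)
    for group, row in PAC_ROWS.items():
        out[("PAC", group)] = round(row["all"] - pac_total(row), 2)
    return out


if __name__ == "__main__":
    for key, diff in residuals().items():
        print(key, diff)
```

The rebuilt totals agree with the printed ones to within rounding for both PAD rows and for the PAC schizophrenic row. The PAC nonpsychotic and college rows differ by somewhat more (under 0.15), which may reflect rounding of the tabled means or damage to the printed digits.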

Separate analyses of variance were performed on the
data for each learning technique. The results are
summarized in Table 3. The analysis of variance for
PAD indicates highly significant differences between
groups and between word sets. The Groups × Word Sets
interaction was also highly significant. It will
become clear that these findings are attributable
largely to the differences between the college
subjects on the one hand and the two hospital groups,
who were quite similar in performance, on the other.
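
For readers who wish to check the PAD F values, the following sketch recomputes them from the mean squares printed in Table 3. It assumes the usual error terms for a design with groups as a between-subjects factor and word sets as a within-subjects factor (groups tested against between subjects; word sets and the interaction tested against pooled subjects × words). That assignment is an assumption, not something the text states, although it reproduces the printed values of 14.99, 56.92, and 10.39. The dictionary keys and function names are hypothetical.

```python
# Mean squares for PAD as printed in Table 3.
PAD_MS = {
    "between_groups": 4547.12,
    "between_subjects": 303.34,         # assumed error term for groups
    "between_word_sets": 16885.30,
    "words_x_groups": 3082.80,
    "pooled_subjects_x_words": 296.64,  # assumed error term for word effects
}


def f_ratio(ms_effect, ms_error):
    """F is the effect mean square divided by its error mean square."""
    if ms_error <= 0:
        raise ValueError("error mean square must be positive")
    return ms_effect / ms_error


def pad_f_ratios(ms=PAD_MS):
    """Recompute the three PAD F ratios under the assumed error terms."""
    return {
        "groups": f_ratio(ms["between_groups"], ms["between_subjects"]),
        "word_sets": f_ratio(ms["between_word_sets"], ms["pooled_subjects_x_words"]),
        "interaction": f_ratio(ms["words_x_groups"], ms["pooled_subjects_x_words"]),
    }


if __name__ == "__main__":
    for effect, f in pad_f_ratios().items():
        print(f"{effect}: F = {f:.2f}")
```

The PAC mean squares are not legible in this copy, so the same check cannot be repeated for that technique.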

On the entire list, utilizing the PAD procedure, it is
evident from Table 2 that the college subjects
achieved a considerably higher number of mean correct
responses (310.35) than either the schizophrenics
(250.60) or the nonpsychotics (256.70). These
differences between the college subjects on the one
hand and the hospital groups on the other were
significant at the .01 level. The general superiority
of the college subjects is not too surprising and, as
stressed earlier, it was not a primary objective to
compare the college with other subjects. More
noteworthy perhaps is the similarity in performance of
the schizophrenic and nonpsychotic groups.

Considering now PAD performance on the word classes it
is evident that the schizophrenics performed far less
well on the A words than on the Na words. The very
large mean difference of 54.7 between these word
classes was highly significant. The nonpsychotic
hospital subjects showed a remarkably similar pattern
with a much lower number of mean correct responses on
the A words (44.05) than on the Na words (84.10).
College subjects also tended to do less well on the A
than on the Na words but the difference, between 75.85
for the former and 82.25 for the latter, is far less
dramatic. In this connection it must be noted that for
these subjects performance on A words was highly
similar to that on the N words. In contrast both
hospital groups performed significantly better on the
N than on the A words.

Turning now to the PAC data, Table 3 indicates that
the between groups F was significant at the .01 level;
the F between word classes now was significant at only
the .05 level. The interaction here was not
significant in contrast to that for the PAD data.
Table 2 indicates again that college subjects
generally performed at a higher level than the
hospital groups which again were more similar to each
other. Although the latter groups attained fewer mean
correct responses on the A than on the N words the
difference for either group was far smaller than it
was for PAD. In fact the college group showed a
greater mean difference between these two word classes
(51.65 vs. 60.02) than did the schizophrenics (25.85
vs. 32.68). The nonpsychotic subjects showed no
difference in performance on these two word sets
(35.15 vs. 36.19). For the schizophrenics, however,
the difference just approached significance at the .05
level; for the college subjects the difference was not
significant at this level. Although the schizophrenics
achieved a lower mean number of correct responses on
the A words (25.85) than did the nonpsychotics
(35.15), this difference was not statistically
significant.


TABLE 3

SUMMARY OF THE ANALYSES OF VARIANCE FOR THE PAIRED-ASSOCIATE DISCRIMINATION
(PAD) AND PAIRED-ASSOCIATE CONVENTIONAL (PAC) LEARNING SCORES

                              PAD                      PAC
Source                        MS           F           F

Between groups                4,547.12     14.99**     10.96**
Between subjects              303.34
Between word sets             16,885.30    56.92**     6.47*
Word × Groups                 3,082.80     10.39**     1.10
Pooled subjects × Words       296.64
Total

* p < .05.
** p < .01.

Certainly performance difference between 
word classes was much greater with PAD 
than with the PAC technique. 


DISCUSSION 


The overall findings of the present study tend to
support a conclusion that associations involving
aversive response terms tend to be learned less well
than associations containing relatively neutral terms.
However, more specific considerations indicate the
need for qualification of such a conclusion and point
to the operation of at least two important variables
that may affect such learning. These include the
learning method and Veterans Administration hospital
status.

Among the more dramatic findings of the present study
were the unique intralearning patterns on the various
word classes displayed by the hospitalized groups with
PAD and the marked differences in performance patterns
obtained with the two learning paradigms. The PAD data
for both hospital groups tend to support the
conclusion that not only schizophrenics but Veterans
Administration patients generally may be particularly
sensitive to material which they react to as aversive.
Why the aversive words of this study, selected as they
were from conventional learning materials, should be
as aversive as they seemed is not readily clear. It is
worth emphasizing that if only schizophrenic subjects
had been employed one might have been willing to
conclude that the results obtained support a
hypothesis previously mentioned, namely, that
schizophrenics are uniquely sensitive to aversive
stimulus characteristics in the environment. The
markedly similar performance of the nonpsychotic
subjects strongly suggests that Veterans
Administration hospital status and not diagnostic
label was the relevant factor in this study.

Moreover, why the apparent sensitization to A words
should have shown up so differently in the two
learning situations raises the question of what
differences in the paired-associate techniques may
have been related to such results. With PAD, patients
tended from the very beginning to show evidence of
responsiveness to the aversive alternative in the
aversive-neutral pairs. This is not to say that they
never verbalized the aversive words. Their performance
of only 70% correct on the Na words suggests the
contrary. This 70%, it is to be noted, does not
represent improvement in performance, as it will be
recalled that patients showed no apparent learning on
the Na items. To interpret this absence of learning as
being due partly or solely to the fact that these
neutral words were paired with aversive ones seems to
be controverted by the absence of learning on the
neutral-neutral pairs. A more important related
finding is that despite the performance, much below
chance, by patients on the A words, differential
reinforcement was beginning to be effective as
indicated by the evidence of significant improvement
in performance only on these words. The importance of
the learning method variable is further emphasized by
the contrasting result for PAC which showed that
hospital subjects were clearly capable of a
significant amount of learning on exactly the same
associations employed with PAD.

An analysis of the PAD and PAC paradigms suggests that
differences in the visual display that each technique
presents to the subject may be related to the
different results obtained with the two learning
techniques. In the case of PAD the subject had to
choose between two response alternatives presented
simultaneously. This may have afforded the subjects a
greater opportunity to observe differences between the
response alternatives and assisted them in ordering
the response words to subclasses which evoked
differential reactions in them. The PAC procedure
which required the subject to pull a single word out
of 16 possible ones did not readily allow for such
comparison and possible ordering of response terms
into subclasses. This is not to suggest that the
differences in visual display represent the only
relevant dimensions on which the two learning methods
differ. The findings point up the need for further
analysis and study of other factors which
differentiate the methods, for example, method of
reinforcement, recognition vs. recall, etc. In
relation to the latter it is interesting that the
hospital subjects at least showed more evidence of
learning with PAC which requires recall than with PAD
which is based more on recognition. This would appear
to be contrary to the usual findings indicating more
efficient performance with recognition than with
recall.

Reinforcement in PAD was relatively unable to overcome
the reaction aroused by the visual display. The
general absence of learning even on the N-N word sets
may have resulted from some complex interaction
associated with the inclusion in a single list of all
word classes. This suggests the desirability of a
study which utilizes different lists, each composed of
only one word class. Another consideration is that PAD
required a discrimination and that insofar as
intelligence may be related to this kind of
performance, patient subjects who were of lower
intelligence than college subjects could not learn
with PAD as they did with PAC. However, analysis
indicated that there was no significant relationship
between performance and intelligence measures for
either learning method.

So far, attention has been focused on the hospital
subjects. It is significant that even the college
subjects showed a trend to perform less well on
associations involving aversive responses than neutral
responses. These differences were much smaller and
more in accord with those reported in previous
studies. On PAD the difference between aversive and
neutral associations was significant but it is well to
note that the performance on A and N words was quite
similar. This suggests that a mechanism similar to
that operating in patients may also have been
operating for the college subjects in the case of A-Na
items. Obviously it was not as clear for the college
as for the hospital subjects. Moreover, with both
learning techniques, college subjects clearly showed
improvement in performance, attaining approximately
the same final level of 80% correct in both cases.

In conclusion, this study points up several important
considerations: (a) the particular method of
paired-associate learning may be a relevant factor and
one cannot assume equivalence for all classes of
subjects of PAC and PAD; (b) hospital status may be an
important variable in such studies as the present one
and it would appear relevant to consider the extent to
which findings in the present study are unique to
Veterans Administration or generalizable to
non-Veterans Administration hospital and other
subjects; (c) in so far as aversiveness may be a
relevant dimension in learning studies generally, it
seems important to scale on this dimension materials
taken from frequently used sources.


SUMMARY 


The objectives of this study were to compare the
acquisition of stimulus-response associations
involving neutral and aversive response members; to
study the acquisition of such associations with a
conventional paired-associate paradigm (PAC) and a
paired-associate paradigm involving discrimination
(PAD). Veterans Administration hospitalized
schizophrenics, Veterans Administration hospitalized
nonpsychiatric patients, and college subjects learned
the same 16 associations with the two different
methods of learning. The results in general supported
the conclusion that it is more difficult to form
associations involving aversive words than neutral
words. Both hospitalized groups showed a marked
sensitivity to the aversive words with PAD and their
pattern of performance was quite different from that
of the college subjects. On the other hand the pattern
of performance for all three groups with PAC was much
more similar and the sensitivity to the aversive words
was much less marked with the hospital groups. The
results point up the following considerations as
important: (a) the particular method of
paired-associate learning used can be a relevant
factor and one cannot assume equivalence of PAD and
PAC with all classes of subjects; (b) hospital status
may be an important variable in such studies and it
would appear pertinent to consider the extent to which
the findings of the present study are unique to
Veterans Administration subjects or generalizable to
non-Veterans Administration hospital and other
subjects; (c) since the aversive words used in the
present study were taken from frequently used sources
for learning material, it might be valuable to scale
such material on a dimension of aversiveness.


SOME EFFECTS OF SHARED THREAT AND PREJUDICE
IN RACIALLY MIXED GROUPS¹


When members of a social system are threatened, marked
changes seem to occur in social relationships
(Jacobson & Schachter, 1954; Schachter, Nuttin, de
Monchaux, Maucorps, Osmer, Duijker, Rommetveit, &
Israel, 1954). Where the consequences of the threat
and the responsibilities for coping with it are
shared, an increase in group cohesion and a reduction
in disruptive antagonisms may occur (French, 1941;
Leighton, 1945; Pepitone & Kleiner, 1957; Sherif &
Sherif, 1953; Wright, 1943). The application of this
general finding to the study of particular social
problems can have important consequences. If the
social system in question is a society, community, or
group containing distinct religious or racial
subgroups, concern about a shared threat may lead to a
decrease in the amount of hostility expressed toward
these minorities.


A shared threat has also been observed to increase
hostility among group members. In Nazi concentration
camps, inmates went so far as to identify themselves
with the source of the threat (Bettelheim, 1943;
Cohen, 1953). At present it is not completely clear
what are the necessary and sufficient conditions for a
shared threat to reduce intermember hostility.
However, a review of the literature suggests the
important determinants are (a) the overwhelming nature
of the threat, (b) the degree to which group action
can ameliorate the threat, and (c) the degree to which
members equally share the consequences of the threat
and the responsibilities for coping with it. In the
concentration camp the threat was quite overwhelming.
Group action provided little amelioration; in fact,
for many inmates a reduction in threat was only
possible by dissociating themselves from the group.
Treatment varied with the category of the inmate, and
little role differentiation occurred other than
imposed by the camp administration.

In the first explicit attempt to test the hypothesis
that shared threat reduced social prejudice, Feshbach
and Singer (1957) presented a set of questions to
individuals designed to provoke concern about dangers
which confront the community as a whole, e.g., floods,
hurricanes, atomic attack. Immediately afterward a
social prejudice questionnaire was administered.
Responses on the final questionnaire were compared to
those the person made a month earlier. The authors
reasoned as follows:


Under the impact of a common threat . . . one’s 
reference group may become the population that is 
subjected to the danger. If this reference group now 
includes both Negro and white, whereas under ordi- 
nary stimulus conditions the reference group has 
been primarily the white population, then the social 
distance between white and Negro should decrease, 
with a corresponding decrement in social prejudice 
(p. 412). 


The results gave only weak support to the 
hypothesis. However, there are considerations 
which suggest the shared threat induced by 
this method may have been relatively weak. 
Requiring people to think about a commu- 
nity-wide disaster does not insure that they 
view it as one in which the suffering and re- 
sponsibilities are equally distributed among 
all community members. In a pilot study con- 
ducted by the senior author, 47 male students 
in the elementary psychology course at the 
University of Texas were administered the 
first four of the five “Flood and Hurricane 
Threat questions” from Feshbach and Singer 
(1957). In addition they were asked whether, if such
a disaster struck Austin, Texas, all or
nearly all socioeconomic levels, ethnic groups,
or neighborhoods would be equally affected. Only
27% thought this to be likely. Over 30% 
thought that there would be large differences 
among various groups in the degree to which 
they suffered from such disasters. Similar dif- 
ferences occurred in regard to the distribution 
of the burden for coping with the disaster. 
Therefore, given this method of induction, the 
extent to which the subjects perceived the 
threat to be shared is ambiguous. 
Furthermore, in a highly complex social
system such as a community, multiple group
membership is the rule. During a disaster, 
the person may experience severe role con- 
flicts. In spite of the perception that the 
threat is shared equally by all community 
members, the role of a father, neighbor, or 
plant manager may be more salient than that 
of community citizen. This phenomenon is 
vividly documented by Killian (1952) in his 
study of the Texas City explosion and of 
three tornado-torn towns in Oklahoma. Thus, 
even when a shared threat is perceived to 
exist in a community setting, it is uncertain 
whether the community as a whole or some 
subsystem will become the salient reference 
group. In the latter case, minorities within 
the community remain outgroups in terms of 
the social relations which are salient for the 
person at that time. Under such conditions, 
social prejudice may be unaffected. 

In order to test the hypothesis that shared
threat reduces the expression of hostility toward
minorities, either one of two general procedures
can be used to minimize the processes
which vitiate the threat induction: some
method may be introduced to assure that the 
person faced by a community-wide threat 
takes the community as the salient reference 
group, or the threat may be induced in a sim- 
pler social system in which the number of 
group memberships available to the person is 
sharply reduced. Both procedures attempt to 
decrease the likelihood that roles or reference 
groups external to the threatened social sys- 
tem become salient. The present experiment 
utilizes the second method. Members of ra- 
cially mixed groups cooperate to solve a logi- 
cal problem. In these groups, failure is clearly 
shared by all members. At the same time all 
members have a role in coping with the status 
loss that results from failure (Deutsch, 1953). 
The social system, furthermore, is simple 
enough so that under the threat of status loss 
few, if any, alternative roles are likely to be- 
come salient other than membership in the 
particular problem solving group. 

Another source of variation in the expres- 
sion of hostility toward an individual Negro 
that should be controlled is the attitude of 
the other members toward this racial group 
as a whole. The stronger the person’s anti-Negro
attitudes, the more likely is he to be
hostile toward a Negro member of his problem
solving group. Thus, in the present study
anti-Negro attitudes as well as shared threat 
will be examined. 

If the expression of hostility toward a Ne- 
gro group member varies directly with the 
strength of anti-Negro attitudes and inversely 
with the degree of shared threat, then the fol- 
lowing predictions can be made: (a) high 
prejudiced individuals under nonthreatening 
conditions will express the greatest amount of 
hostility toward the Negro member; (b) low
prejudiced individuals under shared threat 
will express the least amount of hostility to- 
ward the Negro member; (c) high preju- 
diced individuals under shared threat and 
low prejudiced individuals under nonthreaten- 
ing conditions will display an intermediate 
amount of hostility toward the minority group 
member. In the situation under study hos- 
tility may be manifested in direct evaluations 
made of the Negro, in the frequency with 
which the Negro is rejected from the group, 
and in the avoidance of communication with 
him during the problem solving interaction.


METHOD 


Subjects and confederate. Forty-eight male students
in the elementary psychology course at the University
of Texas were used as subjects. Participation
fulfilled a course requirement. Several weeks before the
experiment they were assessed as to their level of
anti-Negro prejudice by means of Holtzman’s D
scale (Kelly, Ferson, & Holtzman, 1958), in the form
of a “Student Attitude and Opinion Questionnaire.”
This was administered by the instructors in a number
of the sections of the course. The distribution of
prejudice scores was split at the median; subjects
falling above the median were considered high in
prejudice, those below the median, low in prejudice.
In order to minimize the possibility of prior
acquaintanceship, the four subjects used in each
experimental group were drawn from separate sections.
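
The median split described above can be sketched in a few lines. This is only an illustration: the subject labels and scores are invented, and the handling of a score falling exactly at the median is an assumption, since the text does not say how such cases were treated.

```python
from statistics import median


def median_split(scores):
    """Classify each subject as 'high' or 'low' relative to the median score.

    `scores` maps a subject label to a prejudice score. Scores equal to the
    median are labeled 'at_median' (an assumption; the source does not
    specify how ties were handled).
    """
    cut = median(scores.values())
    labels = {}
    for subject, score in scores.items():
        if score > cut:
            labels[subject] = "high"
        elif score < cut:
            labels[subject] = "low"
        else:
            labels[subject] = "at_median"
    return cut, labels


# Hypothetical scores, not data from the study.
example_scores = {"s1": 12, "s2": 30, "s3": 18, "s4": 41, "s5": 25, "s6": 9}
cut, labels = median_split(example_scores)
print(cut, labels)
```

With an even number of subjects, as here, the median lies between two observed scores and no ties arise.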

A Negro confederate was paid to serve as a member
in all experimental groups. The four other members
were, in one half of the groups, all high prejudiced
subjects, in the other half, all low prejudiced
subjects. The confederate participated in several pilot
groups to attain maximum familiarity and skill with
the type of problem to be used in the experiment. It
was necessary to tell him about all phases of the
experiment and its objectives. The only information
that was withheld from him was the extent of
prejudice of the subjects with whom he was to work.

Procedure. Six groups were run with low prejudiced
subjects and six with high prejudiced subjects.
Within each of these two conditions of prejudice,
shared threat was induced in half of the groups,
while a nonthreatening or successful state of affairs
was induced in the other half. The design, therefore,
consisted of three groups of four subjects, plus the
confederate, under each of the following conditions:
High Prejudice, Nonthreat (HPNT); Low Prejudice,
Nonthreat (LPNT); High Prejudice, Threat (HPT);
and Low Prejudice, Threat (LPT).

Communication among the subjects occurred around
a table similar to that used by Leavitt (1951). The
subjects were seated so that each was separated from
the next by a vertical partition extending from a
post in the center of the table. The center post had
slots allowing subjects to push written messages to
other members. Direct communication was permitted
among all members. Messages were written on
colored cards corresponding to the color of the
cubicle from which each subject operated.

As each subject arrived he was given a seat in
front of his cubicle. When all subjects had taken
their places, they were asked to stand and see who
the other members were but not to engage in any
conversation. A copy of the instructions was given
to each member and they were asked to follow as
the experimenter read them aloud. In summary form,
the instructions were as follows:


The purpose of this procedure is to evaluate how
groups work together in solving problems when
communication is limited to written messages. It
has been found that a procedure such as this can
be used to single out groups with different levels
of skillfulness, efficiency, and creativity. The
university recently has become quite interested in
estimating how productively undergraduates can work
together in groups. They have suggested that the
Psychology Department initiate this program of
evaluating groups of students with respect to these
qualities. Thus, a record will be kept for the
university administration of the performance of the
group participating in this preliminary testing.
Skillful, efficient, and creative group problem
solving will be reflected in the time that it takes the
group to solve the problem, i.e., how long after
starting before every member has the correct
answer. Each member will receive a grade that is
based on how well his group performs in solving
these problems. This means, of course, that
everybody in the group gets the same grade. The grade
a group receives will depend on how its performance
compares to that of a large number of other
groups of college students in Texas who have
worked on the same type of problem in the
same type of situation.


During the reading of the instructions the subjects
were standing facing each other.

All groups were given four successive problems to
solve: Leavitt’s (1951) “common symbol” problem.
They were instructed that each member had been
given a different set of symbols and that their task,
as a group, was to discover the symbol that was
common to all members. When a member knew what
this symbol was, he was to put it on a white slip
and place it on top of his section of the center post.
The group was considered to have completed the
problem when all members had placed their white
slip on the center post.

At the conclusion of Task 2, subjects were told to 
stand, stretch their legs, but not to converse. They 
were seated and given an evaluation of their perform- 
ance. Half of the high prejudiced groups and half of 
the low prejudiced groups (HPNT and LPNT) were 
told that their performance was well above average.
The remaining high prejudiced and low prejudiced 
groups (HPT and LPT) were informed that they 
had performed poorly compared to the average per- 
formance of similar groups. The experimenter rein- 
forced these evaluations by making two or three 
positive or negative statements about the group’s 
performance during or immediately after both Tasks 
3 and 4. At the end of Task 4, the experimenter 
similarly evaluated the groups with respect to their 
overall performance. While the final evaluation was
made, subjects were standing in front of their cubicles
facing each other.

Immediately following the final evaluation a
postquestionnaire was administered. On six-point scales,
subjects rated the experimenter in terms of his
“competence as a psychologist” and in terms of their
“liking” for him. Similarly, the test situation was
rated for its “fairness,” its “worthwhileness” and its
“interest.” To partially assess the success of the
threat induction, subjects were asked to rate how
“depressed” they felt at the results of the test. Three
items allowed subjects to evaluate the other four
members. Two of these items involved ranking members
in terms of their contribution to the solutions
and in terms of who the subjects liked best. The
third item required subjects to rate other members
for their estimated “communication and problem
solving skill in everyday life.” At the end of the
questionnaire subjects were given a sheet which
asked if they wished to replace one of the present
members with a new one from the subject pool at
the next testing session. If they did desire to do so,
they were to indicate the rejected member by
encircling one of the four listed colors which
corresponded to the color of his cubicle.

After completing the questionnaires subjects were
given a full explanation of the nature of the
experiments.


RESULTS 


An analysis of variance of the mean times
required for task completion by the four
experimental groups indicates that completion
time decreases significantly over trials
(F = 8.06, p < .001). This is in accord with
Leavitt’s (1951) findings regarding improvement
in performances over successive tasks.

To assess the success of the threat induction,
two t tests were made, one on the self-ratings
of depression as a result of the test,
another on the subjects’ ratings of themselves
and other white members for their communication
and problem solving skills in everyday
life. The tests indicated that threatened
subjects felt more depressed than nonthreatened
subjects (t = 2.82, p < .01). Similarly,
threatened subjects graded themselves and
other white members significantly lower in
everyday communication skills than nonthreatened
subjects (t = 3.67, p < .001). There
were no reliable differences as a function of
threat in regard to the subjects’ evaluations
of the experimenter and of the test situation.

TABLE 1

EVALUATION OF NEGRO CONFEDERATE

[Column headings: mean task contribution rank,
mean likability rank, and mean skill ratio, each
with shared threat present and absent. Row
headings: high and low prejudice. Differences
tested: HPNT vs. HPT; HPNT vs. LPT; HPNT vs.
LPNT; LPT vs. LPNT; LPT vs. HPT. Cell entries
are not legible.]

* Not significant.
* Significant at .10 level.
** Significant at .05 level.
*** Significant at .02 level.
**** Significant at .005 level.

To determine differences in hostile expression
resulting from shared threat and prejudiced
attitude, t tests were run on the
postquestionnaire items in which the subjects
ranked the confederate in terms of contribu- 
tion to task solutions, liking for him, and in 
which they estimated his everyday communi- 
cation and problem solving skill. It was pre- 
dicted that maximum hostility would be ex- 
pressed in HPNT conditions; the least in the 
LPT conditions; while HPT and LPNT sub- 
jects would express an intermediate amount. 
In Table 1 the mean ranks for contribution to
the solutions given to the Negro confederate
are presented. A rank of 1 indicates the greatest
contribution, a rank of 5, the least contribution.
The order of these mean ranks corresponds
exactly to the predicted order. However,
only the differences between HPNT and
HPT and between HPNT and LPT are 
statistically reliable. The difference between 
HPNT and LPNT approaches, but does not 
reach an acceptable level of significance (p 
< .10). For the mean rank given to the Ne- 
gro confederate in regard to “liking,” a score 
of 1 indicates the greatest relative liking for 
the confederate, 4 indicates the least liking. 
The Negro would be expected to be ranked 
lowest in the HPNT condition, highest in the 
LPT condition, and intermediate in the LPNT 
and HPT conditions. The results show the order
of mean ranks once again conforms to what
was hypothesized. Only the differences between
HPNT and HPT and between HPNT
and LPT are significant. On the third item, 
subjects were required to estimate their fel- 
low members’ everyday communication and 
problem solving skill. In the context of this 
item, hostility may be expressed toward the 
Negro by rating him lower than the other 
group members. Ratings of group members, 
however, were shown to be biased by the 
presence or absence of threat. This is cor- 
rected by using a ratio of the rating given to 
the Negro by each subject over the mean rat- 
ing given by the subject to all other mem- 
bers. A high degree of similarity between the 
Negro’s rating and the mean rating is indi- 
cated as the ratio approaches 1. A ratio 
greater than 1 means that the confederate is 
considered less skillful than the average group 
member, less than 1 indicates he is consid- 
ered more skillful than the average. Once 
again the obtained order fits the prediction 
exactly. However, only the differences be- 
tween HPNT and HPT and between HPNT 
and LPT are significant. 
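
The ratio correction described above can be written out as a small function. This is a sketch under stated assumptions: the function name and the sample ratings are hypothetical, and the scale is taken to run so that a numerically higher rating means lower estimated skill, which is what the text's reading of ratios above 1 implies.

```python
def skill_ratio(confederate_rating, other_ratings):
    """Ratio of the rating one subject gave the confederate to the mean
    rating that same subject gave to all other members.

    Dividing by the subject's own mean rating removes the overall shift
    in ratings produced by the presence or absence of threat.
    """
    if not other_ratings:
        raise ValueError("at least one other member must be rated")
    mean_others = sum(other_ratings) / len(other_ratings)
    return confederate_rating / mean_others


# Hypothetical ratings from a single subject, not data from the study.
print(skill_ratio(3, [2, 4, 3]))  # confederate rated like the average member
print(skill_ratio(6, [2, 4, 3]))  # confederate rated less skillful
```

Because each ratio is computed within a single subject, a subject who rates everyone harshly after failure still yields a ratio near 1 if he rates the confederate no differently from the rest.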

With respect to the more general hypothesis 
that shared threat reduces the expression of 
hostility toward the confederate, responses to 
the above three items were analyzed by t
tests for subjects exposed to shared threat 
and those not exposed, regardless of preju- 
dice. The mean ranks given to the confeder- 
ate on the first two items and the mean ratio 
given on the last item by threatened
individuals were 2.50, 2.54, and 0.96, respectively.
The same means for nonthreatened subjects
were 3.70, 3.04, and 1.27, respectively. The 
difference between threatened and nonthreat- 
ened subjects on the first item was significant 
at the .005 level; the difference on the second 
item was significant at the .05 level; and the 
difference between the ratios was significant 
at the .01 level. 

Subjects were given the opportunity, if they
wished, to vote privately on rejecting a
member from the group. In the HPNT condi- 
tion 9 of the 12 subjects decided to reject a 
member. Of these 9 rejections, 6 were of the 
Negro confederate. With 9 subjects making 
use of their privilege to reject 1 of the 4 mem- 
bers in their group, it is highly improbable 
that as large or a larger number of these re- 
jections would be directed toward one mem- 
ber by chance (p < .01). In the LPNT con- 
dition, 8 subjects wished to reject another 
member. The confederate received 3 of these 
rejections. Both in the HPT and LPT condi- 
tions 9 subjects decided to reject another 
member; and in each of these conditions 2 
rejections were directed toward the confed- 
erate. In none of the latter three conditions 
did the frequency of rejecting the confederate 
depart significantly from what would be ex- 
pected by chance alone. Thus, only under the 
HPNT condition, where the strongest expres- 
sion of hostility toward the confederate was 
expected to occur, is the Negro rejected more 
frequently than chance. 
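
One way to check the chance probabilities quoted above is an exact binomial tail. This is a sketch only: the text does not name the test that was used, so the binomial model, with each rejection treated as an independent choice among 4 members (chance probability 1/4), is an assumption, as is the function name.

```python
from fractions import Fraction
from math import comb


def tail_probability(n, k, p=Fraction(1, 4)):
    """Exact P(X >= k) for X ~ Binomial(n, p).

    n: number of subjects who chose to reject a member
    k: number of those rejections directed at the confederate
    p: chance probability of picking any one of the 4 other members
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Rejection counts as reported in the text (rejections, of which confederate).
conditions = {"HPNT": (9, 6), "LPNT": (8, 3), "HPT": (9, 2), "LPT": (9, 2)}
for name, (n, k) in conditions.items():
    print(name, float(tail_probability(n, k)))
```

Under this model the HPNT tail falls just below .01, while the tails for the other three conditions are far above conventional significance levels, which agrees with the pattern reported in the text.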

Another significant source of information 
concerning the orientation of the members to- 
ward the Negro is the proportion of the task 
messages sent to him during the course of the 
problem solving interaction. Earlier studies 
have shown that interpersonal dislike can be 
coordinated to an increase in the barriers to 
communication (Festinger, Cartwright, Barber,
Fleischl, Gottsdanker, Keysen, & Leavitt,
1948; Festinger, Schachter, & Back, 1950;
Potashin, 1946). Thus, it is reasonable to ex- 
pect that the amount of task communication 
with the Negro would vary inversely with the 
degree of hostility felt toward him. The total 
number of messages each subject sent to all 
other subjects was counted. The percentage of 
this total which the subject sent to the con- 
federate was then computed. This was done 
only for those tasks following the initial induction
of shared threat, i.e., Tasks 3 and 4.
The analysis of variance of these percentages 
indicates that only the F ratio (4.64) for 
prejudice is significant (p < .05). The differ- 
ences between threat conditions and between 
tasks did not approach significance. Figure 1
shows that the low prejudiced subjects, both
threatened and nonthreatened, sent a greater
proportion of their messages on Tasks 3 and 
4 to the confederate than subjects in either 
high prejudiced condition. 
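
The communication measure, the percentage of each subject's task messages that went to the confederate, can be sketched as follows. The function name, the recipient labels, and the message counts are hypothetical; returning `None` for a subject who sent no messages is also an assumption, since the text does not mention that case.

```python
def percent_to_confederate(messages_sent, confederate="confederate"):
    """Percentage of one subject's messages addressed to the confederate.

    `messages_sent` maps each recipient to the number of messages this
    subject sent him over the tasks of interest (Tasks 3 and 4 in the text).
    """
    total = sum(messages_sent.values())
    if total == 0:
        return None  # no messages sent, so no percentage is defined
    return 100.0 * messages_sent.get(confederate, 0) / total


# Hypothetical counts for one subject, not data from the study.
counts = {"confederate": 5, "red": 10, "blue": 5, "green": 5}
print(percent_to_confederate(counts))
```

With four possible recipients, a subject who spread his messages evenly would send 25% of them to the confederate, which gives a natural reference point for reading the percentages.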


DISCUSSION 


It appears that the expression of hostility 
toward a Negro group member varies directly 
with the strength of anti-Negro attitudes, and 
inversely with the degree of shared threat. 
Moreover, prejudice against Negroes as a 
group may be expressed through a reduction 
in communication to an individual Negro. 
This is similar to Schachter’s (1960) observa- 
tions regarding communication to a persist- 
ent deviant. Of course, since there could be 
no question of the Negro changing his “devi- 
ant” position, i.e., his status as a Negro, there 
was no initial rise in communication to the 
confederate as was found by Schachter dur- 
ing the early phases of interaction. Moreover, 
it is interesting to note that avoidance of 
communication with the Negro by high prejudiced
subjects occurred in a situation where
messages were of an impersonal, task oriented 
nature and where the Negro member pos- 
sessed information of value to other mem- 
bers in solving the problem. 

The prediction that shared threat would in- 
hibit tendencies to avoid communication with 
the Negro was not confirmed. No difference in 
communication to the confederate appeared 
as a function of shared threat. There are a 
number of possible explanations as to why 
shared threat reduced the expression of hos- 
tility toward the Negro in terms of direct 
evaluation on the postquestionnaire, but had 
no effect on the tendency to avoid communi- 
cation with him. The first bears on the pro- 
cedure used to induce the threatening and 
nonthreatening conditions. It will be recalled 
that the evaluation of the group’s performance
by the experimenter, which was the
means whereby threat was induced, was not
made until after the second task. This was
relatively late in the problem solving process. 
The reorganization of the person’s initial 
attitude toward the Negro may take some 
time. Thus, attitude change may not have oc- 
curred in time to appreciably affect task com- 
munication. This explanation loses some of its 
force when one notes in Figure 1 that the differences
in communication to the Negro as a
function of prejudice occur more markedly
in the second two tasks than in the first two.
An ongoing attitude change process should at 
least prevent such a difference from becoming 
more pronounced. Nevertheless, it might still 
be argued that with partitioned cubicles, a 
relatively long period of time is required be- 
fore subjects become impressed with the fact 
that one member is a Negro; and still later, 
more time is needed for failure and status 
loss to sink in. Thus, the experiment may 
have obtained a sample of behavior when the 
subjects had fully noted the presence of the 
Negro but before the shared threat had an 
appreciable effect on communication. On this 
basis it would be predicted that if more than 
four tasks were given, subjects in the HPT 
condition would eventually begin to increase 
communication with the confederate. 

A second line of reasoning assumes that 
avoidance of communication is a less direct 
form of hostile expression than rating the person
as poor with respect to certain valued
attributes. It also focuses on what is being 
affected by the induction of shared threat. 
Are prejudiced attitudes being remolded, or 
are the expressions of hostility stemming from 
such attitudes being inhibited without any un- 
derlying attitude change? If attitude change 
had occurred under shared threat, less overall 
hostility, direct and indirect, should be ex- 
pressed toward the confederate. This did not 
occur. If, however, shared threat served to 
inhibit direct aggression without modifying 
prejudiced attitudes, our expectation would 
be quite different. In this case, it would be 
anticipated that high prejudiced individuals 
confronted by a common threat would ex- 
press a smaller amount of direct hostility to- 
ward a minority group member than equally 
prejudiced but unthreatened individuals. Both 
groups, nevertheless, would be expected to ex- 
press a similar amount of indirect hostility 
(avoidance of communication with the Ne- 
gro) which would be greater than that mani- 
fested by less prejudiced individuals. 

Finally, discriminatory behavior based on 
cultural norms and the affective orientation 
toward Negroes may under certain conditions 
be uncorrelated. One can follow the discrimi- 
natory practices of one’s group without neces- 
sarily entertaining feelings of hostility. This 
suggests a third possible interpretation. Since 
about 75% of the items on the prejudice scale 
used in this study concern appropriate behav- 
ior toward Negroes, a high anti-Negro preju- 
dice score may indicate that the person has 
strongly internalized the discriminatory be- 
havior patterns of Texas culture. The strength 
of these norms regarding behavior may not be 
appreciably modified by a momentary event 
in a temporary group. Thus, the shared threat 
induced may have produced a positive change 
in the affective orientation toward the Negro 
group member while having no influence on 
conformity to cultural patterns which stress 
avoidance of equal status interaction. 


SUMMARY 


The purpose of this experiment was to test 
the relationship between shared threat and 
the expression of prejudice hypothesized by 
Feshbach and Singer (1957). Forty-eight subjects,
varying with respect to anti-Negro
prejudice, were placed under conditions of 
shared threat or nonthreat, in task oriented, 
cooperative work groups. A Negro confeder- 
ate was a member in each group.
It was found, as hypothesized, that under 
conditions of shared threat a reduction in the 
expression of prejudice occurs in terms of di- 
rect evaluation of the Negro by other group 
members on a posttask questionnaire. No sig- 
nificant differences in the amount of com- 
munication to the confederate occurred as a 
result of the threat induction. However, sig- 
nificantly fewer messages were addressed to 
the Negro by the high prejudiced subjects, 
regardless of the presence or absence of shared 
threat. 

Most investigations of attitude change have
been concerned with evaluative components
of attitude, i.e., shifts in the direction and
strength of affective responses to attitudinal 
stimuli. Asch (1952) called attention to the 
importance of structural properties, i.e., the 
patterning or organization of cognitive ele- 
ments which comprise the attitude object. 
Although explicit concern with structural fea- 
tures was shown by Peak (1955), Smith, 
Bruner, and White (1956), Green (1954), 
and Rosenberg (1956), there have been few 
attitude change experiments dealing with 
modifications in both structural and evalua- 
tive aspects of attitudes. The present experi- 
ment examined the interaction of restructur- 
ing and revaluation as alternative and com- 
plementary modes of reduction of cognitive 
dissonance. 

Festinger’s (1957) theory of cognitive dissonance
has implications for attitude change
where a person carries out behavior contrary
to his beliefs or opinions. Experiments on
forced compliance show that the greater the 
cognitions supporting compliance to such dis- 
crepant behavior, the less the cognitive dis- 
sonance and consequent attitude change (as 
measured by evaluation) toward the initially 
disliked position (e.g., Cohen, Brehm, & 
Fleming, 1958; Festinger & Carlsmith, 1959). 
Two recent studies (Brehm & Cohen, 1959; 
Cohen, Terry, & Jones, 1959) have shown 
that cognitive dissonance decreases with de- 
crease in feelings of personal volition or in- 
crease in force to comply. Consequently, more 
attitude change would be expected where in- 
dividuals experience a high rather than a low 
degree of subjective choice in engaging in be- 
havior contrary to their prior beliefs. 

A second determinant of magnitude of
forced compliance dissonance may be the extent
to which the complier is confronted with
the cognitions associated with his dissonant
behavior. Such confrontation should have
consequences for attitude change to the extent
that the individual’s attention and thinking
are focused on the implications of his
discrepant stand.

1 The present study is a portion of a dissertation
presented to the faculty of the Graduate School at
Yale University in candidacy for the degree of Doctor
of Philosophy. The guidance and encouragement
of A. R. Cohen and the criticisms of J. W. Brehm
are gratefully acknowledged.

In the present study, choice and confronta- 
tion were varied to set up differing amounts 
of dissonance and consequent pressure to- 
ward attitude change. When a person feels 
that he has had little choice in carrying out 
discrepant behavior, little dissonance should 
be produced and confrontation with the im- 
plications of his action should bring height- 
ened resistance to revaluation. However, un- 
der conditions where an individual feels it 
was entirely up to him whether or not he car- 
ried out the contrary behavior, dissonance 
should be produced; the more the confronta- 
tion, the more the dissonance and consequent 
attitude change in line with the discrepant 
behavior in order to reduce that dissonance. 
Thus, under high choice conditions, there will 
be more positive attitude change under high 
than under low confrontation; under low 
choice, there will be more positive attitude 
change under low than under high confronta- 
tion. 


Attitude and Cognitive Structure 

Cognitive restructuring refers to changes in 
the organization of cognitive elements which 
comprise an attitude. Zajonc (1954, 1960) fol- 
lowed Lewin (e.g., 1951, pp. 83-84, 305-338) 
in designing empirical operations for measur- 
ing cognitive structure. Two of these meas- 
ures were employed here: the number of cate- 
gories used to group attitudinal elements 
(grouping); the number of relations among 
the elements (bonding). A problem for dis- 
sonance theory concerns the possibility that 
changes in grouping and bonding of attitudi- 
nal elements may interact with, or serve as 
well as, revaluation of those elements when 
dissonance has been aroused. A more general
issue concerns the cognitive resources
that may be evoked by a person when he ex- 
periences inconsistency between his beliefs 
and his behavior. Cognitive restructuring has 
been employed (Asch, 1952; Lewin, 1951) to 
describe characteristic responses under such 
circumstances. Changes in grouping and bond- 
ing of elements are here assumed to be modes 
of cognitive restructuring. 
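
The two structural measures can be illustrated with a small sketch. The data layout is an assumption made for illustration: groups are given as lists of slip letters, bonds as a mapping from each slip to the other slips the subject relates it to, and each slip is taken to belong to at most one group. The function names and sample data are hypothetical and are not the scoring procedure of the study.

```python
def grouping(groups):
    """Grouping score: the number of categories used to sort the elements.

    Raises ValueError if an element is placed in more than one category.
    """
    seen = set()
    for members in groups:
        for slip in members:
            if slip in seen:
                raise ValueError(f"slip {slip!r} appears in more than one group")
            seen.add(slip)
    return len(groups)


def bonding(bonds):
    """Bonding score: the total number of relations reported among elements."""
    return sum(len(related) for related in bonds.values())


# Hypothetical responses from one subject, not data from the study.
groups = [["A", "C"], ["B"], ["D", "E", "F"]]
bonds = {"A": ["C"], "B": [], "C": ["A", "D"], "D": ["E"]}
print(grouping(groups), bonding(bonds))
```

Restructuring would then appear as a change in either score between the premeasure and the postmeasure.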

In the present study, subjects were given 
two measurement sequences after they had 
carried out behavior contrary to their beliefs. 
In the first, opportunity for restructuring of 
the beliefs preceded opportunity for revaluat- 
ing the beliefs; in the second sequence, re- 
valuation came before restructuring. This 
counterbalancing was aimed at exploring the 
comparative effectiveness of dissonance re- 
duction through one avenue (revaluation) 
with another possible avenue (cognitive re- 
structuring). In the absence of theory con- 
cerning the effects of dissonance on restruc- 
turing, examination was made of: (a) the 
direct effects of choice, confrontation, and order
of measurement, on restructuring; (b) the
extent to which revaluation and restructuring 
were associated under different treatment 
combinations. 


METHOD 
General Design 


After evaluative and structural measures were taken 
of the subjects’ religious beliefs, they carried out con- 
trary behavior by writing an essay opposed to these 
beliefs under conditions of high and low choice, and 
high and low confrontation. The essay topic—“Why 
I Would Like to Become a Catholic”—met the fol- 
lowing criteria: (a) subjects were clearly opposed to 
the position they undertook to support, (b) they 
were as interested and ego involved as possible, (c) 
arguments and reasons for the essay position were 
known to the subjects. Following this attempt to 
arouse differential dissonance and consequent atti- 
tude change pressure, the same evaluative and struc- 
tural measures were again administered in a bal- 
anced order. 


Subjects 


Subjects were 183 Yale non-Catholic freshmen who
were approached in their dorms. Each was asked,
privately, to give his “present religious affiliation,” 
and to say “how much he had considered giving it 
up in order to join another religious organization.” 


Procedure 


Premeasures of attitude and structure were ad-
ministered under the guise of exploring a new sur-
vey method. The subject rated three eight-point
Likert-type items: goodness of arguments and rea- 
sons heard for “becoming a Catholic”; degree of 
“sympathy and understanding” for “someone like 
myself who had become a Catholic”; extent of per-
sonal opposition to “becoming a Catholic.”² In or-
der to elicit whatever cognitions occurred to him
when he thought about his “becoming a Catholic,” 
the subject was given a packet of 16 alphabetized 
slips with the following printed instructions: 


Complete the sentence, “For me becoming a 
Catholic would mean”: in as many ways as you 
can. On each slip write a short, concise, phrase 
that completes this sentence. Write as many com-
pletions as easily come to your mind.


When the subject had written as many completions 
as he could, he indicated what groups he saw among 
the implications (slips) he had written by following 
these instructions: 


Now lay out all the slips you have written in 
front of you. You will notice that some slips seem 
to belong with other slips. Indicate how groups 
naturally form among the slips you have written 
by writing the letters of the slips in each group 
next to the roman numerals at the left-hand side 
of this page. You may show as many groups as 
you want, but a slip may appear in only one 
group. 

Finally, to determine what bonds (relations) the sub- 
ject saw among his slips he was given these instruc- 
tions: 


Now consider each slip, separately, and ask your- 
self what other slips would have to be changed, 
modified, or excluded, if the slip you are consider- 
ing were changed, modified, or excluded. For ex- 
ample, next to letter “A” write the letters of the 
other slips that would be affected if what slip “A” 
says were changed or no longer true. 


Thus, bonds among cognitions were indicated by 
showing that changing, modifying, or rejecting one 
implication of becoming a Catholic would be ac-
companied by alterations in one or several other im-
plications (Zajonc, 1954, 1960).

2 The attitude items were suggested by a content
analysis of essays on “becoming a Catholic” written
by other Yale freshmen in a pilot phase of the pres-
ent experiment. Three kinds of opposition to becom-
ing a Catholic were displayed: personal opposition
to this course of action; denunciation of reasons
heard for becoming a Catholic; lack of sympathy for
and failure to understand others who had converted.
Three items were added to provide a baseline for
estimating possible regression effects; they referred
to the “elimination of intercollege athletics at Yale”
and were otherwise identical with the “Catholic”
items.

Experimental manipulations. A week later subjects
were invited by phone to a room near their dorm
for a “follow-up on the survey research.” Of the 183
original subjects, 38 refused to participate again, all
but 1 indicating pressures from study or social ac-
tivity. Only 1 subject refused because he felt of-
fended at having to answer questions on religion.
When subjects reported in small groups (2-4) to
the experimental room, they were handed printed in-
structions guiding them (with the exception of the
choice manipulation) through the remaining experi-
mental events. Subjects read:

Forceful, creative, and compelling essays are 
needed on the following theme: “Why I Would 
Like to Become a Catholic.” Even though this po- 
sition may not be your own, do the best you can 
to write persuasive and original arguments.


At this point, in the High Choice condition, the ex- 
perimenter said: 

I would like to emphasize that even though you 
came over here tonight, there is no obligation at 
all to write this essay. If you don’t want to write 
it you can get up and walk out if you wish. Is 
that clear? 


Most subjects nodded. The experimenter turned to 
each subject individually and asked: 


You know that it's entirely up to you whether or
not you write the essay? Are you sure? 


All but seven subjects agreed and the manipulation 
was concluded after each had said “Yes” at least 
once. The loss of seven subjects who left the ex-
periment showed that the option to write the essay
was effectively communicated. (These 7 subjects and 
the 38 refusing to participate further in the experi- 
ment did not differ from the others on the premeas- 
ures of evaluation and structure. Therefore, it was 
felt that their attrition did not bias the results.) In 
the Low Choice condition, no such option was pre-
sented; after the subjects read the essay instructions,
the experimenter said “please start writing now.” 

After 10 minutes, subjects in the High Confronta-
tion condition were instructed to write out their
essays again, ranking the sentences in terms of how 
“original and persuasive” they perceived their argu- 
ments to be. Here, the original material was copied 
over except that the sentences were reordered by the 
subject to show his assessment of their originality 
and persuasiveness (an emphasis on the “meaning” 
of the essay). In the Low Confrontation condition, 
subjects were instructed: 

Copy your essays over using the number-of- 
syllables chart on the next page. List every word 
of your essay in the column appropriate to its 
number of syllables.


Under Low Confrontation, all the essay material was 
copied again but the emphasis was on grammar 
rather than meaning. Thus, the only difference be- 
tween the High and Low Confrontation conditions 
was the extent to which the subject’s thinking was 
focused on the implications of his reasons and argu- 
ments. 

Following these manipulations, the premeasures 
were readministered. Approximately half the subjects 
rated the attitude scales and then were given a packet 
of slips to fill out and indicate groups and bonds. 
This condition, where the measure of evaluation
preceded the measure of structure, is labeled Evalua-
tion-Structure. In the Structure-Evaluation condition,
the measure of structure preceded the measure of 
evaluation. 

A questionnaire assessed the effectiveness of the 
manipulations and the extent to which alternative 
modes of dissonance reduction, such as dissociation, 
had been employed. Finally, the experiment was ex- 
plained and subjects were asked not to reveal its 
purposes for an appropriate length of time.

Selection of subjects. In order to assure that writ-
ing the essay was indeed dissonant, only those sub-
jects were used who completed the attitude scale
measuring personal opposition with opposed, very
opposed, or completely opposed. A person who devi-
ated on this attitude premeasure by more than a
half-scale unit in the favorable direction from op-
posed was dropped from the entire study. Data from
11 subjects were omitted for this reason; thus, 127
subjects constituted the present sample.


RESULTS 
Effectiveness of the Experimental Manipula- 
tions 


In completing the final questionnaire, the 
subject indicated on eight-point scales his 
perceived degree of “obligation to write the 
essay” and the extent to which he felt “writ- 
ing the essay was up to him.” The High 
Choice subjects reported less obligation (t =
2.12,³ p < .05) and more option (t = 3.92,
p < .001) than Low Choice subjects. Similar
scales called the subject’s attention to his re-
writing of the essay: he was asked to report
his degree of “awareness of the meaning of
what” he wrote, and the extent to which he
“deliberated upon the implications and con-
sequences” of his “reasons and arguments.”
The High Confrontation subjects reported
more awareness of meaning (t = 4.47, p <
.001) and more deliberation (t = 6.24, p <
.001) than the Low Confrontation subjects.


Effects of Choice, Confrontation, and Order 
of Measurement, on Attitude Change 


As anticipated, the three attitude scales 
measured somewhat different aspects of a per-
son’s feeling toward “becoming a Catholic”:
overall correlations of change scores were 
positive but not significantly different from 
zero. Since all three scales were considered 
meaningful and relevant, changes on each 
scale were combined by addition to provide 
an index of amount of revaluation of beliefs,
following the writing of the essay. Support
was found (Table 1) for the general expecta-
tion that more positive attitude change (in
favor of “becoming a Catholic”) would be
obtained under High than under Low Choice
conditions (p < .05). In addition, the pre-
dicted interaction effect of choice and con-
frontation was obtained: under High Choice
more favorable revaluation occurred under
High than under Low Confrontation; under
Low Choice, this effect was reversed. Order
of measurement (Evaluation-Structure vs.
Structure-Evaluation) did not affect the in-
teraction: combining data for the order of
measurement conditions yielded confirmation
at an acceptable level (p < .05).⁴

3 All t tests in the present report are two-tailed
unless otherwise noted.

4 Between-condition differences on each of the scales
also supported the predicted interaction. Although it
was obvious (after the fact) that more compelling
confirmation (p < .001) could have been obtained by
ignoring changes on the first scale, subsequent men-
tion of amount of revaluation refers to combined
changes as reported in Table 1.


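TABLE 1

[Table title and the Structure-Evaluation panel are not recoverable from the source.]

Means (cell Ns in parentheses)

Evaluation-Structure

                 High             Low
                 Confrontation    Confrontation    Mean

High Choice      12.33 (15)        0.53 (15)       6.43
Low Choice        4.89 (19)        7.30 (20)       6.13
Mean              8.18             4.40            6.26 (69)

High Choice, both orders combined: 7.53 (53)

Analysis of Variance* [entries not recoverable]

Note.—Cell Ns in parentheses. [The remainder of the note and the asterisked note survive only as fragments referring to opportunity for restructuring, opportunity for evaluative change, and a method by Walker and Lev.]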
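
As a check on how Table 1 is laid out, the marginal means can be recomputed as N-weighted averages of the cell means. The following is a minimal sketch, assuming the Evaluation-Structure cells read 12.33 (15), 0.53 (15), 4.89 (19), and 7.30 (20); the function names are illustrative and are not part of the original report.

```python
# Cell means and Ns for the Evaluation-Structure order, as read from Table 1.
# Keys are (choice level, confrontation level); values are (mean, n).
cells = {
    ("High Choice", "High Confrontation"): (12.33, 15),
    ("High Choice", "Low Confrontation"): (0.53, 15),
    ("Low Choice", "High Confrontation"): (4.89, 19),
    ("Low Choice", "Low Confrontation"): (7.30, 20),
}


def weighted_mean(pairs):
    """Return the N-weighted mean of an iterable of (mean, n) pairs."""
    pairs = list(pairs)
    total_n = sum(n for _, n in pairs)
    return sum(m * n for m, n in pairs) / total_n


def marginal(level, axis):
    """Weighted mean over cells whose key equals `level` at position `axis`."""
    return weighted_mean(v for k, v in cells.items() if k[axis] == level)


high_choice = marginal("High Choice", 0)
low_choice = marginal("Low Choice", 0)
high_confrontation = marginal("High Confrontation", 1)
low_confrontation = marginal("Low Confrontation", 1)
overall = weighted_mean(cells.values())

for label, value in [
    ("High Choice", high_choice),
    ("Low Choice", low_choice),
    ("High Confrontation", high_confrontation),
    ("Low Confrontation", low_confrontation),
    ("Overall", overall),
]:
    print(f"{label}: {value:.2f}")
```

The recomputed values agree with the printed margins to within rounding of the cell means, which suggests (but does not prove) that the printed margins are weighted rather than unweighted averages.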


Some possible relations between dissonance
and cognitive restructuring were explored
using the data in Table 1. First, if the re-
structuring operation (grouping and bonding
implications of becoming a Catholic) reduced 
dissonance, less attitude change should be 
obtained when opportunity for restructuring 
preceded opportunity for revaluation. Evi- 
dence in this direction was provided by an 
overall tendency for less positive attitude 
change in the Structure-Evaluation than in 
the Evaluation-Structure conditions: −1.12
< 6.26 (t test p < .10). Second, if the as-
sumption was correct that restructuring, like
revaluation, reduced dissonance, then, where 
little dissonance was produced, one or the 
other of these avenues of reduction might 
suffice. Consequently, under Low Choice (low 
dissonance), less favorable revaluation could 
be expected to occur when opportunity for it 
was preceded by opportunity for restructur- 
ing. Under Low Choice, less attitude change 
was obtained when opportunity for restruc-
turing came first: −7.74 < 6.13 (t test p <
.01).

While the statistically large differences in 
Table 1 suggested that the restructuring op- 
eration reduced dissonance, one contrary dif- 
ference was noted. Under High Choice, there 
was slightly more attitude change under the 
Structure-Evaluation conditions than under 
the Evaluation-Structure conditions: 8.96 > 
6.43 (nonsignificant). This result, in conjunc- 
tion with the reversal under Low Choice, ac- 
counted for the significant interaction be- 
tween choice and order of measurement (p 
< .05). 


Effects of the Independent Variables on
Amount of Restructuring


The implications seen by each subject of 
his “becoming a Catholic” were analyzed in 
terms of the number of slips written, change 
in content from before to after writing the 
discrepant essay, change in the number of 
cognitive groups, and change in the number 
of cognitive bonds.⁶ There were no differ-
ences between the eight experimental condi-
tions in the sheer number of implications
(slips) written. Both times the slips were 
written (before and after the essay) the over-
all average was about 7 slips, with a range
from 3 to 16. There were similarly no differ- 
ences in the extent to which new implications 
were indicated following the essay or in the 
extent to which pre-essay implications failed 
to be mentioned after the essay. Approxi-
mately 80% of the original implications were
repeated in all experimental conditions. Per- 
sons very opposed to becoming a Catholic ap- 
peared to hold attitudes comprising elements 
that were stable and unchanging. Subjects in
all conditions indicated about the same num-
ber of groups and bonds on the premeasure
before the essay.

5 Typical implications were: “losing my friends,”
“giving up clear thinking,” “not being able to prac-
tice birth control,” “hurting my parents,” “becoming
part of a world organization,” “being obliged to go
to church every week,” “feeling guilty more often
than necessary,” etc. The Protestants, Jews, agnos-
tics, and atheists, who were approached by the ex-
perimenter, saw a variety of mainly unfavorable im-
plications.

6 Where N = number of cognitive elements (slips
written), change in groups signifies differences in
groups/N from before to after essay writing; change
in bonds signifies difference in bonds/N(N − 1).
Each denominator, of course, represents the possible
number (of groups and bonds).
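
The normalized structural indices described in the footnote on change in groups and bonds can be sketched as follows. This is an illustration under stated assumptions: change is taken as the after-essay index minus the before-essay index (the direction of subtraction is not stated in the report), the function names are hypothetical, and the example counts are invented for demonstration only.

```python
def group_index(n_groups, n_slips):
    """Groups relative to the possible number of groups (one per slip)."""
    if n_slips < 1:
        raise ValueError("at least one slip is required")
    return n_groups / n_slips


def bond_index(n_bonds, n_slips):
    """Bonds relative to the N(N - 1) possible directed bonds among N slips."""
    if n_slips < 2:
        raise ValueError("at least two slips are required to form a bond")
    return n_bonds / (n_slips * (n_slips - 1))


def change(before, after):
    """Assumed convention: positive values mean an increase after the essay."""
    return after - before


# Invented example: 7 slips on both occasions,
# 3 groups and 10 bonds before the essay, 2 groups and 8 bonds after.
group_change = change(group_index(3, 7), group_index(2, 7))
bond_change = change(bond_index(10, 7), bond_index(8, 7))

print(f"change in groups: {group_change:+.4f}")
print(f"change in bonds:  {bond_change:+.4f}")
```

Dividing by N and by N(N − 1) keeps subjects who wrote many slips from dominating the change scores, since each index is expressed relative to what was possible for that subject.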

Changes in cognitive grouping. Change 
scores showed a general tendency toward 
fewer groups in all conditions except one. 
However, an analysis of variance yielded no 
significant outcomes and thus eliminated the 
possibility that the independent variables di- 
rectly influenced regrouping. It was possible 
that whatever restructuring accomplished for 
the reduction of dissonance was achieved by 
decreasing rather than increasing the num- 
ber of groups. The unavailability of control 
subjects (no essay writing) prevented ex- 
ploration of this hypothesis. 

Change in cognitive bonding. The mean 
amounts of change in number of bonds re-
ported in Table 2 showed no main effects but
the Choice × Confrontation interaction was
significant (p < .01). Under High Choice,
there was less decrease in bonds under High
than under Low Confrontation; under Low
Choice, less decrease under Low than under
High Confrontation. These differences in
amount of bond change exactly paralleled
those for attitude change (see Choice × Con-
frontation, Table 1). They suggested that
relatively less decrease and/or increase in
bonding reduced dissonance. “Less decrease
in bonds,” in spite of its descriptive accuracy,
is conceptually unwieldy. Since specifying
bonds was somewhat laborious, doing it a
second time (following the essay) probably
evoked less interest and more fatigue, factors
that may have accounted for an overall tend- 
ency to decrease bonds. Thus, the only sig- 
nificant outcome favored the possibility that 
dissonance was reduced by increasing rela- 
tions among implications of “becoming a 
Catholic.” 


Correlational Results 


The extent to which change in groups and 
positive attitude change were associated ap- 
peared to be affected by the independent 
variations (Table 3). If decrease in groups 
facilitated favorable revaluation, these vari- 
ables should be positively correlated under 
conditions where attitude change was rela- 
tively great. These conditions were, under 
High Choice, High Confrontation, and, under
Low Choice, Low Confrontation. Here, de-
crease in groups and positive attitude change
were positively associated while, in other con-
ditions, inverse relations occurred. Although
only two of the four relevant comparisons in
Table 3 were statistically significant, the pat-
tern of differences in amount of correlation
suggested that decrease in groups facilitated
attitude change.

In a similar analysis of change in bonds
(Table 4), substantial correlations between
bond increase and positive attitude change
were obtained only under conditions produc-
ing relatively greater attitude change. This
outcome corroborated the direct effect of
choice and confrontation on change in bonds.
Hence, increasing relations among implica-
tions of “becoming a Catholic” may have fa-
cilitated favorable revaluation of anti-Catho-
lic beliefs.

There were no significant differences be-
tween conditions in the extent to which
change in groups and in bonds were associ-
ated. The overall correlation of these vari-
ables was only .12 (p > .20).


TABLE 2
MEAN AMOUNT OF CHANGE IN COGNITIVE BONDING BY EXPERIMENTAL CONDITION
ANALYSIS OF VARIANCE

Means

[Cell positions and some signs are not reliably recoverable from the source; entries are listed in the order in which they survive.]

Structure-Evaluation (columns: Low Confrontation, High Confrontation)
High Choice: 5.80, [illegible]
Low Choice: 6.88, −1.09
Other entries: 6.45, 4.43

Evaluation-Structure (columns: Low Confrontation, High Confrontation)
High Choice: −10.53
Low Choice: [illegible]
Other entries: −6.94, −5.51, 1.95, −3.11

Analysis of Variance*

Source                               MS        F

Choice                               4.66      <1
Confrontation                       36.59      1.07
Order                                8.41      <1
Choice × Confrontation             327.55      9.60***
Choice × Order                      73.92      2.17
Confrontation × Order               11.47      <1
Choice × Confrontation × Order      33.31      <1
Error                               34.09

Note.—The more positive the mean, the greater the increase in number of cognitive bonds. Cell Ns are the same as in Table 1.
* [Beginning of note illegible] to Walker and Lev (1953, p. 381).
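
The F ratios in the Table 2 analysis of variance can be recomputed from the mean squares. The sketch below assumes that every effect is tested against the single error mean square (34.09), which is the usual arrangement for a fixed-effects factorial design; the report does not state this explicitly, and the variable names are illustrative.

```python
# Mean squares as printed in the Table 2 analysis of variance.
mean_squares = {
    "Choice": 4.66,
    "Confrontation": 36.59,
    "Order": 8.41,
    "Choice x Confrontation": 327.55,
    "Choice x Order": 73.92,
    "Confrontation x Order": 11.47,
    "Choice x Confrontation x Order": 33.31,
}
ms_error = 34.09

# Assumed rule: F = MS(effect) / MS(error) for every source of variation.
f_ratios = {source: ms / ms_error for source, ms in mean_squares.items()}

for source, f in f_ratios.items():
    shown = "<1" if f < 1 else f"{f:.2f}"
    print(f"{source:32s} F = {shown}")
```

Under this assumption only the Choice × Confrontation ratio (about 9.6) is large, which matches the statement in the text that this interaction was the only significant outcome for bond change.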


Alternative Avenues of Reduction and Arti- 
facts 

Opportunity was provided for two general
modes of coping with dissonance: revaluation
and restructuring. However, the results might
be explained in terms of dissociation from the
essay behavior, discounting the essay as an
academic exercise, differential effort to write
the best essay possible, or perception of the ex-
periment as meaningless. An analysis of the
attractiveness of these alternatives (as meas-
ured on the postquestionnaire) showed that
they could not account for the results. Other
eliminated explanations concerned potential
artifacts: obtained attitude change was due
to statistical regression; under some condi-
tions more compelling essays were written;
persons likely to be especially resistant to
adopting a more favorable attitude toward
becoming a Catholic, e.g., Jews and/or athe-
ists, were disproportionately represented in
the experimental subgroups. Detailed discus-
sion of these points is found in Brock (1960).


TABLE 3
RANK-ORDER CORRELATIONS BETWEEN DECREASE IN
COGNITIVE GROUPS AND POSITIVE
ATTITUDE CHANGE

Column groups as printed: Evaluation-Structure, Structure-Evaluation; within each, Low Confrontation and High Confrontation.

High Choice: .28, −.31, .22, .15
Low Choice: −.39, .43 [remaining entries illegible]
[Signs and column positions of the entries are not reliably recoverable from the source.]

Note.—Cell Ns are the same as in Table 1. p values computed
for one-tailed t tests.
a Difference between −.39 and +.40, p < .02.
b Difference between −.43 and +.18, p < .04.


TABLE 4
RANK-ORDER CORRELATIONS BETWEEN INCREASE IN
COGNITIVE BONDS AND POSITIVE
ATTITUDE CHANGE

Column groups as printed: Structure-Evaluation, Evaluation-Structure; within each, Low Confrontation and High Confrontation.

High Choice: +.52, +.03, +.45 [one entry missing]
Low Choice: .37, .31 [remaining entries illegible]
[Signs and column positions of the entries are not reliably recoverable from the source.]

Note.—Cell Ns are the same as in Table 1. p values computed
for one-tailed t tests.
a Difference between −.37 and +.31, p < .03.
b Difference between −.18 and +.36, p < .06.
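
For readers unfamiliar with the rank-order correlations reported in Tables 3 and 4, the following is a minimal sketch of one common coefficient of that kind, Spearman's rho, computed as the product-moment correlation of ranks with tied values given their average rank. The report does not say which rank-order coefficient was used, so treating it as Spearman's rho is an assumption, and the data below are invented for illustration.

```python
def average_ranks(values):
    """Rank values from 1 to n, assigning tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rank-order correlation between x and y."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented scores for five hypothetical subjects in one condition.
bond_change = [0.02, -0.05, 0.10, -0.01, 0.04]
attitude_change = [3, -2, 9, 4, 1]

rho = spearman_rho(bond_change, attitude_change)
print(f"rho = {rho:.2f}")
```

Because only ranks enter the computation, the coefficient is unaffected by the skewed or bounded scales on which change scores of this kind are usually measured.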


DISCUSSION 
The attitude change results supported the 
theoretical derivations. Persons choosing to
carry out behavior contrary to their opinions 
are strongly motivated to reduce the dis- 
sonance thus produced by revaluation of the 
opinions so that they are more favorable to- 
ward the discrepant behavior. This effect of 
choice was consistent with effects of this vari- 
able in other studies reviewed by Cohen 
(1960) and supported his summarization: 

Where choice is varied, expectations from dis-
sonance theory are fulfilled only under high-choice
conditions; under low-choice conditions, straightfor-
ward motivational or resistance effects seem to ac-
count for the results (p. 306).

Under high dissonance (High Choice), opin-
ion change in the direction of contrary be-
havior is greater to the extent that aware-
ness of the meaning of the behavior and de-
liberation on its implications are heightened;
when motivation to reduce opinion-behavior
inconsistency is weak (Low Choice), con-
frontation with the behavior implications may
evoke “rebellion,” strengthening of the origi-
nal opinions, or, at least, resistance to re-
valuation in a direction consonant with the
contrary behavior.

An exploration of the effects of disso-
nance on cognitive restructuring ruled out two
possibilities: the restructuring operations in-
creased dissonance; dissonance and restruc-
turing were entirely unrelated. Change in
cognitive groups was not directly influenced
by the independent variations. The interac-
tion effect of choice and confrontation on
bond change did not allow an unequivocal
interpretation. A consistent pattern of dif-
ferences in the degree of association between
each of the modes of restructuring and atti-
tude change was noted. However, only half
of the relevant comparisons were statistically
significant. These results cannot be the basis
for any definitive generalization about the re-
lation between dissonance, revaluation, and
restructuring. But they warrant the conclu-
sion that, when dissonance is aroused by
carrying out contrary behavior, revaluation
of cognitions referring to the original beliefs
reduces dissonance more effectively than
change in the way those beliefs are cogni-
tively grouped and interrelated.

If, following contrary behavior, beliefs were 
modified by increasing their interrelatedness 
and decreasing the number of groups among 
them, these changes might be interpreted in 
a number of ways. With respect to general 
determinants of dissonance (Festinger, 1957), 
it might be argued that the salience of the 
original beliefs or their relevance to the con- 
trary behavior could be reduced by decreas- 
ing the number of categories required for 
grouping them. Salience might also be dimin- 
ished by linking the beliefs with one another 
(increasing bonds) so that fewer beliefs are 
seen as autonomous or unrelated to the others.

Osgood and Tannenbaum’s (1955) concept 
of incongruity is similar to dissonance. These 
authors might say that reduction of incon- 
gruity or “polarization along the evaluative 
dimension,” was achieved by organizing “bad” 
implications of becoming a Catholic into a 
“tight” (many bonds) and “undifferentiated”
(few groups) cluster. 

Other interpretations stem from the aver- 
siveness of contemplating negative implica- 
tions of becoming a Catholic after creating
pro-Catholic propaganda. By increasing bonds
and decreasing groups among these implica- 
tions, differentiations among them could be 
lost; they could be more readily labeled in an 
omnibus fashion, consigned to “logic-tight 
compartments,” and thus more easily sup- 
pressed. Too, the implications of becoming 
Catholic represented potential deprivations. 
Grouping them into fewer categories might 
enable them to be perceived as fewer losses; 
if one deprivation could be dismissed, increas- 
ing bonds among the implications might al- 
low rejection of related deprivations as well. 
All of these interpretations overlap; further 
conceptual work and less ambiguous empiri- 
cal returns are needed before a clear theo- 
retical preference can be stated. 


SUMMARY 


The general objective was to investigate to 
what extent and under what conditions cog- 
nitive restructuring interacts with evaluative 
change to reduce cognitive dissonance. Dis- 
sonance was produced by varying the extent 
of choice and confrontation under which in- 
dividuals engaged in discrepant behavior. The 
subjects were non-Catholic Yale freshmen 
who wrote persuasive essays in favor of “be- 
coming a Catholic.” After dissonance was cre- 
ated in this fashion, half of the subjects were 
first given the opportunity to revaluate their 
former anti-Catholic beliefs and then given 
the opportunity to restructure these prior be- 
liefs. The other half of the subjects were given 
these measures in reverse order. The revalua- 
tion procedure consisted of administration of 
attitude scales relevant to “becoming a Catho- 
lic.” The restructuring procedure consisted of 
instruments designed to elicit subjectively 
perceived implications of “becoming a Catho- 
lic,” and the way in which these implications 
were grouped and interrelated. 

The results showed an interaction be- 
tween choice and confrontation in determin- 
ing amount of attitude change. Persons who 
were given a choice to write or not to write 
the essay (high dissonance) evinced more 
change to the extent that they were con- 
fronted by the implications of what they had 
written (High Confrontation). Persons who 
were given no option to create pro-Catholic 
propaganda (low dissonance), evinced more 


RESTRUCTURING AND ATTITUDE CHANGE 


271 


resistance to favorable change under High 
than under Low Confrontation. 

The independent variables had no effects 
on cognitive grouping but cognitive inter- 
relatedness, like revaluation, appeared to be 
an interactive function of choice and con- 
frontation. Positive correlations were obtained 
between revaluation and increased homogene- 
ity and interrelatedness under conditions pro- 
ducing relatively greater positive attitude 


change.


AN INVESTIGATION OF CLINICAL JUDGMENT:
A STUDY IN METHOD¹


The clinical psychologist as diagnostician
has provided a large share of the data avail- 
able from the myriad studies of judgment al- 
ready in the literature. Whether the clinician 
as subject is taken in situ, however, or ob-
served under more controlled circumstances
working with selected test protocols, two prob- 
lems arise in understanding both the process 
and product of his judgments. 

First, how are we to identify the cues actu- 
ally used in making a judgment or predic- 
tion? Although most investigators provide the 
judge with certain specific data and ask him 
to base his prediction solely on that informa- 
tion, clinicians do, after all, acquire some 
biases and vagaries along the way. Rubin and 
Shontz (1960), for instance, present data show-
ing that their judges were able to “describe”
(using a Q sort technique) a paranoid schizo- 
phrenic patient on the basis of statistical data 
alone: date of birth, date of admission, edu- 
cation, occupation, etc. When given additional 
information in the form of a tape recording 
of an interview and test protocols, their de- 
scriptions changed very little. Although Rubin 
and Shontz term their judges’ performances 
“a process of considered discrimination,” one
can never be certain whether the final diag-
nosis was primarily on the basis of age or high
F+%. Even when we provide our clinician
judge with the sparsest morsels of data in 
quantitative form, as Hoffman (1960) has 
done, we do not know what he adds to those 
data from his own frame of reference and ex-
perience nor precisely what contribution this
makes to the total variance of his judgments.

1 This report is based upon two dissertations (Lee,
1960; Tucker, 1959) submitted to the Graduate
School of the University of Kentucky in partial ful-
fillment of the requirements for the degree of Doctor
of Philosophy. Experiment I is part of a dissertation
by R. Bennett Tucker under the direction of Betsy
W. Estes. Experiment II reports data from a dis-
sertation by Joan C. Lee with James S. Calvin as
director of the study.

2 Now at Howard University.

3 Now at Veterans Administration Hospital, Lex-
ington, Kentucky.

A second problem lies in the choice of cri- 
terion for assessing the validity of a given 
judgment. Frequently, the judge’s prediction 
is called right or wrong on the basis of a 
previous diagnosis by other clinicians who had 
access to more information regarding the pa- 
tient. This is a somewhat questionable pro- 
cedure in view of the evidence that the more 
information available to the judge, the less 
accurate his judgment (Gage, 1953; Giedt, 
1955; Kostlan, 1954). Rubin and Shontz 
(1960) suggest that the more obvious the 
pathology, the more likely psychologists are 
to agree on the diagnosis. While most studies 
probably include some severe and perhaps 
“obviously” pathological cases in the sample 
to be judged, they must also include many 
borderline cases as well. The judge who dis- 
agrees with the criterion in a study of this 
type could well be right and the experimenter 
wrong. 

The present paper describes two experi- 
ments in which quasiclinical judgment tasks 
were devised to study the judge’s use of multi- 
ple cues. In both, naive judges were given 
tasks unique to them in order to minimize the 
influence of preconceived biases or individual 
level of competence on their performances. In 
each of the experiments the criterion can be 
clearly and objectively defined. It is sug- 
gested that this type of investigation circum- 
vents the problems discussed above, yet still 
contributes to our understanding of the basic 
judgment process. 


EXPERIMENT I: THE COMBINING OF 
SEPARATE LEARNING EXPERIENCES 
IN PROBABLE CUE JUDGMENTS


Calvin and Curtin (1958) have described 
clinical judgment as “a synthesis of impres- 
sions.” This type of judgment is probabilistic 
in that a decision is reached by weighing and
combining signs, or cues, which have varying
probabilities. Thus, a judgment or diagnosis 
is reached by considering a number of rele- 
vant, probabilistic cues in the presence of 
other irrelevant and conflicting cues. In order 
to investigate this type of judgment, Calvin 
and Curtin devised a task in probability learn- 
ing using series of cards varying in several 
respects: size, form, color, texture, dot-size 
and number of dots appearing on the card, 
and thickness. In a pilot study, three of these 
attributes were made relevant, probable cues 
and subjects were given the task of learning 
to identify X cards and Not-X cards. This 
appears to be analogous to the situation in 
which the clinician attempts to identify schizo- 
phrenic patients, for instance, from Rorschach 
determinants. The results of this preliminary 
study indicated that learning occurred, but 
that the subjects tended to overuse one of the 
three relevant cues in identifying X cards
(i.e., to rely upon it beyond its degree of cue
validity) and to underplay or totally neglect
the other relevant cues. This type of one-cue
learning persisted over many training trials
despite the fact that subjects were informed
of the correctness or incorrectness of their
judgments. In the same task, but presented 
with two equally valid cues, another group of 
subjects showed the same tendency to over- 
use one preferred cue. 

Because it has been shown that previous 
practice with simple concepts helps with the 
learning of more complex ones (Kendler &
Vineburg, 1954), it seemed likely that sub-
jects who were required to learn to use a
previously neglected cue might be more likely
to make judgments on the basis of multiple,
probable cues when they were again presented
with the favored cue and the neglected cue
together.


TABLE 1

PROPORTION OF THE THREE LEVELS OF COLOR AND
SIZE CUES COMBINED AS INDICATORS
OF X CARDS

Columns: Number of X cards, Number of Not-X cards, Percentage indicator of X
Rows as printed: 6-inch red, 6-inch blue, [illegible], 8-inch yellow, 8-inch red, 8-inch blue, 10-inch yellow, 10-inch red, 10-inch blue, Total
[Cell values are not recoverable from the source.]


TABLE 2

DIFFERENT TYPES OF CARDS HAVING THE SAME
DEGREE OF VALIDITY AS INDICATORS OF X

Percentage indicator of X

.25              .50               .75              1.00

6-inch red       6-inch blue       10-inch red      10-inch blue
8-inch yellow    8-inch red        8-inch blue
                 10-inch yellow


Method 


Subjects. Nine “bright normals,” six males and 
three females, served as subjects. Although no for- 
mal test of intelligence was administered, subjects 
were sufficiently well known to the experimenter to 
enable him to make an estimate of their intelligence. 
These estimates, based primarily upon verbal facility,
amount of education, profession, and interests, were
all of “high average” intelligence, or higher. Due to
the length and difficulty of the task, factors such as 
motivation, cooperation, and available time were also 
taken into consideration in the choice of subjects. 

Materials. Three sets of 72 training cards and one 
set of 72 test cards were constructed which varied in 
color, size, shape, texture, and size of dots appearing 
on them. 

In this experiment, cue is defined as a discriminable 
attribute of the stimulus. A probable cue is one ap- 
pearing less than 100% of the time as X. A relevant 
cue is an attribute which occurs more, or less, than 
50% of the time relevant to X-ness, while an irrele- 
vant cue occurs exactly 50% of the time relevant to 
X-ness and 50% of the time not relevant to X-ness.


Condition I 


For the 72 training cards of Condition I, color and 
size were selected as relevant cues. The proportion 
that various levels of these variables in combination 
was an indicator of X-ness is shown in Table 1. For 
instance, the 10-inch blue card was always X, while 
the 10-inch red card was called X by the experi- 
menter 75% of the time. The 6-inch yellow card was 
never an X, and the 6-inch red card was called X 
only 25% of the time. Table 2 compares the different
types of cards having the same degree of validity
as indicators of X. It can be seen that, in general,
the red and blue cards are more likely to be
called X than the yellow cards; and the larger the 
card, the more likely it is to be an X card. 

Form, dot-size, and texture were made irrelevant
cues; i.e., each level of these variables appeared 50%
of the time as X and 50% of the time as Not-X.

These 72 cards were divided into two sets of 36
cards and called Training Series A and B. Seventy-two
test cards for Condition I (also used in Conditions
II and III) were constructed in exactly the
same manner with color and size as the relevant cues,
and dot-size, form, and texture as irrelevant cues.
These 72 cards were then randomly divided into four
series of 18 cards each and these series were called
Test Series a, b, c, and d.


Condition II


Two sets of 72 training cards were constructed for 
Condition II. In one of these sets, color was made 
the only cue relevant to X-ness. Size was held con- 
stant by making all of the cards 8 inches. The degree 
to which various colors were an indicator of X-ness 
was the same as in Condition I. Form, dot-size, and 
texture were kept as irrelevant cues. These 72 cards 
were randomly divided into two series of 36 cards 
each and called Training Series A and B.

In the other set of 72 cards, size was the only 
relevant cue. Color was held constant by making all 
cards in the series red. The degree to which various 
cards were indicators of X-ness was the same as in 
Condition I. These 72 cards were also divided into 
two series of 36 cards each and called Training Se- 
ries A and B. 

Since the same Test Series a, b, c, and d were 
used in Conditions I and II, both color and size were
cue variables in all test series. 


Condition III 


The materials for Condition III were those used in 
Condition I and the procedure of Condition I was 
replicated save for the number of training and test 
trials. 

Procedure. A minimum of 8 training and 8 test
series for Conditions I and II was predetermined
arbitrarily for all subjects. If, at the end of these
series, the coefficient of cue utilization showed that
the subject had not learned to use one or both of
the relevant cues, the number of training and test
series was increased to 12 or 16, i.e., until the
coefficient of cue utilization differed significantly from zero
(r = .34). All subjects were given exactly 8 training
and 8 test series in Condition III.

At the beginning session of each condition, the
following instructions were read to the subject:


I am going to hand you some cards, one at a 
time. I want you to take each card and look at it. 
Some of the cards are called “X cards” and some 
are “Not-X.” When I give you a card, I will name 
it. I will tell you whether it is an X or a Not-X. 
I want you to try to learn to identify, as well as 
you can, which cards are X and which are Not-X. 
You may hold a card as long as you like. As you
finish with the cards, please stack them, face down,
to your right.


Training Series A was shown. In presenting each
card, the experimenter said: “This is an X card”;
or, “This is not an X card.” After the completion of
Training Series A, Test Series a was given, the following
instructions being first read to the subject:


This is a test series. I want you to tell me
whether you think each card is X or Not-X.


Training Series B was then presented in the same 
manner as Training Series A. Test Series b and sub- 
sequent test series were presented as Test Series a 
had been. 

Instructions for Conditions II and III were identical
with those of Condition I. However, in Condition II
those subjects who had shown a preference
in Condition I for the color cue were given the training
series in which size was the only relevant cue.
Subjects who had relied primarily on the size cue in
Condition I were now given the training series in
which color was the only relevant cue. Thus, each
subject in Condition II was trained on only one
variable, color or size.


Results 


Cue validity is defined as the extent to 
which a cue, i.e., a given attribute of a card, 
is a valid indicator of X-ness, as computed 
by the point biserial coefficient of correlation. 
The validity coefficient for each of the rele- 
vant cues, color and size, as a probable indi- 
cator of X cards, was .41. In combination, 
these cues were probable indicators of X 
cards with a validity coefficient of .58. The 
extent to which a subject uses a given cue to 
categorize a card as X or Not-X may be esti- 
mated by correlating cue dimensions with the 
subject’s judgments of X or Not-X. Coeffi- 
cients of cue utilization were computed by 
combining two test series of 18 trials each, 
giving a total of 36 trials. With 33 df, coeffi- 
cients of utilization differed with statistical 
significance from zero at the .05 level of con- 
fidence when r = .34 or above. 
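
As a minimal illustrative sketch (not part of the original analysis), the coefficient of cue utilization can be computed as a point-biserial correlation between a numerically coded cue and the subject's dichotomous judgments. The function name, the 1/0 coding of X and Not-X, and the sample data are assumptions made for illustration. In particular, the text does not say how the three colors were scored, so treating a cue as a set of ordered numeric levels is an assumption here.

```python
from math import sqrt


def point_biserial(cue_values, judgments):
    """Point-biserial r between a numeric cue and 1/0 judgments (1 = X, 0 = Not-X)."""
    n = len(cue_values)
    if n != len(judgments) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 trials")
    called_x = [c for c, j in zip(cue_values, judgments) if j == 1]
    called_not_x = [c for c, j in zip(cue_values, judgments) if j == 0]
    if not called_x or not called_not_x:
        raise ValueError("judgments must include both X and Not-X")
    mean_all = sum(cue_values) / n
    # Population standard deviation of the cue over all trials.
    sd = sqrt(sum((c - mean_all) ** 2 for c in cue_values) / n)
    if sd == 0:
        raise ValueError("cue does not vary across trials")
    p = len(called_x) / n  # proportion of trials judged X
    mean_x = sum(called_x) / len(called_x)
    mean_not_x = sum(called_not_x) / len(called_not_x)
    return (mean_x - mean_not_x) / sd * sqrt(p * (1 - p))


# Hypothetical six-trial example: cue levels coded 1-3, judgments coded 1/0.
r = point_biserial([1, 2, 3, 1, 2, 3], [0, 0, 1, 0, 1, 1])
print(round(r, 2))  # approximately 0.82
# The paper's criterion (.34) applies to 36 pooled trials, not to a toy sample.
```

The same function applies to either cue: passing card size in inches (6, 8, 10) as `cue_values` gives a size coefficient under the stated coding assumption.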

Subjects may be said to use, overuse, un- 
deruse, or not use a relevant cue. For in- 
stance, if the coefficients of cue utilization for 
four subjects are .00, .22, .34, and .68, while 
actual cue validity is .41, the first subject is 
not using the cue. The second, whose coeffi- 
cient is .22, is underusing the cue. The third 
subject is using the cue, and the fourth sub- 
ject, whose r = .68, is overusing the cue since 
his reliance upon it exceeds the extent to 
which the cue is a valid indicator of X-ness. 
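
The four-way distinction above can be sketched as a simple decision rule. This is only one possible reading: the text gives four example coefficients but no explicit cut-offs, so the `negligible` threshold and the rule that any coefficient above the cue validity counts as overuse are assumptions introduced here for illustration.

```python
def classify_cue_use(r, validity, critical_r=0.34, negligible=0.10):
    """Label a coefficient of cue utilization relative to cue validity.

    critical_r: smallest r significant at the .05 level (.34 in the text).
    negligible: hypothetical cut-off below which the cue counts as unused.
    """
    if r < negligible:
        return "not used"
    if r < critical_r:
        return "underused"
    if r <= validity:
        return "used"
    # Assumption: reliance beyond the cue's validity is treated as overuse.
    return "overused"


# The four example coefficients from the text, with cue validity .41.
for r in (0.00, 0.22, 0.34, 0.68):
    print(r, classify_cue_use(r, validity=0.41))
```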

Six of the nine subjects showed one-cue
rather than multiple-cue learning in Condi- 
tion I. These subjects tended to focus upon 
one relevant cue and to overuse this cue in 
making their judgments. Although several at- 
tempts to use both relevant cues, as well as 
incidents of switching from one cue to an- 
other as a basis for judgments, were noted, 
at the end of Condition I all of these subjects 
were overplaying the preferred cue to the al- 
most total neglect of the other relevant cue. 
Two subjects showed a definite tendency to 
synthesize judgments by using both relevant 
cues in Condition I, but only one of the rele- 
vant cues was used consistently. One subject 
failed to learn the task in Condition I, and 
he was dropped from the experiment. Ques- 
tioning revealed that he had failed to grasp 
the probabilistic nature of the task. Color 
happened to be the preferred cue for all sub- 
jects but one, who overused the size cue. 

In the Test Series of Condition II, five subjects
(3, 4, 5, 6, and 9) switched from their
previously preferred relevant cue (color) and
learned to make their judgments by using
the size cue, the only cue reinforced for these
subjects during the training series of Condition II.
One subject (1) learned the previously
neglected cue (size) and used it in
making judgments but continued to overuse
his previously preferred cue (color). Another
subject (7) used both relevant cues early in
the test series of Condition II. Then, the color
cue, unreinforced in the training series, was
abandoned and the size cue was greatly overused
in terms of its cue validity. The only
subject (8) who learned to operate on the
basis of size in Condition I continued to use
this cue in the test series of Condition II but
did not use it at the level of statistical significance.
The color cue, the only cue reinforced
for this subject in the training series
of Condition II, was learned and overused.
Thus, all eight subjects learned in Condition
II to make judgments on the basis of a previously
neglected relevant cue.

There was considerable variability in the
use of the two relevant cues in Condition III.
Five subjects (1, 5, 6, 8, and 9) gave some
evidence of having learned to combine their
separate learning experiences and make multiple-cue
judgments. Two other subjects (4 and
7) used both relevant cues in making their
judgments but tended to underplay one of
these cues and to overuse the other. One subject
demonstrated the ability to operate with
either of the relevant cues separately but was
unable to combine the two in Condition III.
Coefficients of cue utilization in Condition III
for the color and size cues separately and for
the two combined are given in Table 3.


TABLE 3

COEFFICIENTS OF CUE UTILIZATION FOR INDIVIDUAL
SUBJECTS IN CONDITION III

(Rows: subjects; columns: test series. Cell values are illegible in the source scan.)

In this task, accuracy of judgment is de- 
fined as the percentage of times the subject’s 
judgment coincided with the experimenter’s 
calling a card X or Not-X in the training 
series. Subjects do not show the steadily ris- 
ing curve typical of a learning task, perhaps 
because of the probabilistic nature of the task 
and the relatively few trials. Subjects con- 
tinued to try out various cues, singly and in 
combinations, even after they became convinced
that no cue would work perfectly. As
long as the training series continued, subjects
would continue to test different hypotheses in 
an effort to improve their accuracy. Such hy- 
potheses, when found not to pay off, were 
usually abandoned, and the subject would 
then make his judgments in the following test 
series by reverting to the color and size cues, 
singly or in combination. 

Coefficients of cue utilization for all test series
were combined for the group of eight subjects.
z′ values were obtained for all r’s (Edwards,
1954), and an average taken for each
of the test series. The mean z′ value for each
test series was then converted back to the appropriate
value of r. For subjects having more
than the minimum of four r’s for any one condition,
pairs of z′ values were averaged. The
resulting four coefficients of utilization for 
each variable in each of the three conditions 
are presented graphically in Figure 1. In Con- 
dition I, the group as a whole made increas- 
ingly greater use of the color cue, while neg- 
lecting the size cue. In Test Series d, the co- 
efficients of cue utilization for color and size 
were .61 and .15, respectively. In Condition 
II, the group as a whole shifted to the size 
cue, with concomitant neglect of the color 
cue. Coefficients of cue utilization in Test 
Series d for size and color were .63 and .21,
respectively. It will be recalled that for seven
of the subjects, size was the only cue being 
reinforced in the training series, and these 
subjects switched from the originally pre- 
ferred color cue to the training cue as a basis 
for their judgments. In Condition III, both
relevant cues were used at the .05 level by 
the group in judging the cards, coefficients of 
cue utilization for color and size being .40 
and .55, respectively (see Figure 1). 
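
The averaging procedure described in this paragraph (convert each r to Fisher's z′, average, convert the mean back to r) can be sketched as follows. The function name and the sample coefficients are hypothetical. The sketch assumes the standard transformation z′ = atanh(r), which is what the cited tables tabulate.

```python
from math import atanh, tanh


def mean_r_via_fisher_z(coefficients):
    """Average correlations by way of Fisher's z' transformation."""
    if not coefficients:
        raise ValueError("need at least one coefficient")
    # r = 1 or r = -1 has no finite z', so such values are rejected.
    if any(abs(r) >= 1 for r in coefficients):
        raise ValueError("each coefficient must lie strictly between -1 and 1")
    z_values = [atanh(r) for r in coefficients]
    return tanh(sum(z_values) / len(z_values))


# Hypothetical coefficients of cue utilization from several subjects.
group_r = mean_r_via_fisher_z([0.20, 0.80])
print(round(group_r, 2))  # about 0.57, above the arithmetic mean of 0.50
```

Averaging in z′ units rather than averaging the r's directly reduces the distortion caused by the bounded, skewed sampling distribution of r.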


Discussion 


At least two investigators (Goldberg, 1959; 
Hunt, 1959) have reported studies of clini- 
cal judgment in which nonprofessional judges 
were as successful as clinical psychologists in 
predicting from test data. In Goldberg’s study 
trainees and staff psychologists did no better 
than a group of secretaries in diagnosing or- 
ganic brain damage from the Bender Gestalt 
Test. All diagnosed above a chance level. One 
possible interpretation of Goldberg’s study is 
that formal training may actually contribute 
little to the clinician’s diagnostic skill. 

Fig. 1. Coefficients of cue utilization for color, size, and cues combined (N = 8).

The present data indicate that subjects can
learn to combine relevant, probable cues and
use these cues, in terms of their cue validity,
in making judgments. It would appear that
learning and practice with cues taken separately
are important for the later use of them
in combination. Holt’s (1958) findings that 
his subjects were less successful than he had 
hoped in using carefully prepared manuals 
listing statistically validated cues might well 
be reversed if his judges had had some ap- 
propriate training in the use of the cues he 
provided them. Perhaps preliminary training 
trials requiring the judges to make predic- 
tions on the basis of each of the cues taken 
separately would have produced results simi- 
lar to those we have reported. 

Studies demonstrating that the judge can
use a number of cues also suggest his limitations.
If the use of multiple cues can
be facilitated by training, we must devise ef- 
fective training methods and put them to use 
before we can give any final answer to the 
basic question, how much information can the 
judge process easily and efficiently? 


EXPERIMENT II: CUE SYNTHESIS IN A
QUASICLINICAL TASK


Psychologists develop their biases regarding 
test data long before they are asked to participate
in an experimental investigation of
clinical judgment. If we could persuade our
clinician-subject to disregard his training in- 
sofar as it inhibits his good judgment, his 
predictions might represent maximum rather 
than typical performance. Since this is not 
easy to do, there is one alternative. “Naive” 
judges, i.e., subjects with no training in in- 
terpreting test data, can be placed in a situa- 
tion analogous to that of the clinician who is 
asked to predict behavior from test scores. 

An appropriate judgment task was con- 
ceived of as follows: one in which the judge 
would predict, from test scores, a type of 
behavior involving primarily those abilities 
which the tests measured. More specifically, 
a sort of “game” was constructed using abilities
for which suitable measures could be
found.

Thurstone (1938) has identified a number
of different, relatively independent mental
abilities which together make up what is
called intelligence. These are measured by
the Science Research Associates tests of
Primary Mental Abilities (1949). The intermediate
form of this test for ages 11-17 provides
separate scores for spatial ability, reasoning,
word fluency, verbal meaning, and
number ability. The PMA was administered 
to a group of University of Kentucky fresh- 
men. The low correlations between tests for 
this group of subjects indicated that the tests 
were measuring relatively independent men- 
tal abilities. On the basis of maximum scat- 
ter, three tests were selected for use in this 
study—number, reasoning, and space—and 
the game of Rocket was invented. 

Rocket is a game for one player. The sub- 
ject is asked to imagine himself a dispatcher 
in a space station somewhere above the planet 
Earth. His job is to check passengers and 
freight on every rocket leaving the station, 
and to see that each flight of three rockets 
has a squadron leader. He plays by going 
through three stacks of cards—passenger 
cards, freight cards, and rocket cards—and 
recording passengers cleared for each flight, 
freight OK’d, and squadron leaders on a tally 
sheet. 

The correct identification of passengers for 
each rocket demands spatial ability. The sub- 
ject is required to visualize a two-dimensional 
figure as it would appear if rotated in the 
same plane. All items for this part of the 
game were taken from the 1943 edition of the 
PMA and are of the same type as the space 
items appearing in the 1949 revision. In 
checking “stellar weights” on the freight 
cards, the subject must find all errors in a 
series of multiplication problems. The prob- 
lems were taken from the 1943 edition of the 
PMA. All number problems in the 1949 re- 
vision require the subject to check the addi- 
tion of columns of three two-digit figures. The 
subject identifies squadron leaders by cor- 
rectly choosing the last in a series of sym- 
bols taken from Raven’s (1958) Progressive 
Matrices Test. This requires reasoning abil- 
ity, and is similar to the letter series found 
in the PMA. 

The three PMA tests of spatial, reasoning, 
and number ability were administered to a 
total of 30 subjects, who were also asked to 
play Rocket for 20 minutes. The entire ses- 
sion required approximately an hour and a 
half. (Instructions and practice items for both 
tests and game require as much time as the 
actual testing.) Product-moment correlations
between test and game for the subjects are
reported in Table 4. The PMA abilities correlated
with game performance to varying degrees.
Correlations between PMA tests were
negligible, except perhaps for the correlation
between space and reasoning.


TABLE 4

CORRELATIONS AND INTERCORRELATIONS BETWEEN
PMA TESTS AND ROCKET

(Legible row labels: Rocket score, Space, Number. Correlation values are illegible in the source scan.)

A multiple regression equation using the 
two best cues, space and reasoning, yielded a 
multiple correlation coefficient of .84 between 
actual and predicted scores for the 30 sub- 
jects. Optimal weighting of all three cues in 
a regression equation resulted in a multiple 
correlation of .88. 
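
For two predictors, the multiple correlation can be obtained directly from the three pairwise correlations. The sketch below uses the standard two-predictor formula. The correlations passed to it are hypothetical placeholders, since the values of Table 4 are not legible here, and the function name is likewise an illustrative choice.

```python
from math import sqrt


def multiple_r_two_predictors(r_y1, r_y2, r_12):
    """Multiple correlation of a criterion with two predictors.

    r_y1, r_y2: correlation of each predictor with the criterion.
    r_12: correlation between the two predictors.
    """
    if abs(r_12) >= 1:
        raise ValueError("predictors must not be perfectly correlated")
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return sqrt(r_squared)


# Hypothetical values standing in for the space, reasoning, and
# space-reasoning correlations (the actual Table 4 entries are unknown).
print(multiple_r_two_predictors(0.70, 0.60, 0.30))
```

When the two predictors are nearly independent, as the text reports for most PMA test pairs, the squared multiple correlation approaches the sum of the two squared validities.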

This set of 30 test protocols and criterion 
measures provided the material for the judg- 
ment situation. Given test scores reflecting 
the subject’s number, space, and reasoning
ability, the judge would be asked to predict
how well that subject would do playing
Rocket, a game requiring primarily those 
abilities (number, space, and reasoning) for 
which he had been tested. 


Method 


Subjects. The group of 30 judges included 9 males
and 21 females, ranging in age from 18 to 35 years.
They were introduced to the task by being told that
they would be asked to make a series of predictions
on the basis of test results in the same way that
psychologists do. All judges were familiar with psychological
tests and their purposes, but none of the
judges had any training in testing.

Procedure. All judges followed the same procedure 
in making a total of 30 predictions in groups of 10. 
The judges began by familiarizing themselves with 
the tests and game. They read the PMA test instruc- 
tions and worked the practice problems. The game 
was explained to them, and each judge worked 
through the first unit. The scoring system for tests 
and game was explained and maximum points on 
each indicated. 

Each judge was then given a set of 10 test proto- 
cols and asked to try to rank the subjects according 
to their skill as Rocket players and predict how 
many of the total number of points possible they 
might earn. As soon as they completed their first 
series of predictions, they were given the scores 
actually earned by each of the 10 subjects in the
group and encouraged to compare these scores with
their predictions. 

Following their first series of 10 predictions, the 
PMA tests were administered to each judge under 
the same conditions as for the subjects. Then each 
judge played Rocket for 20 minutes. Test and game 
responses were scored immediately, and the judge 
was urged to compare the two. Then he was asked 
to predict for a second group of 10 subjects. As be- 
fore, his predictions were corrected and errors called 
to his attention. 

Before he made the final series of 10 predictions,
each judge was told the relations between tests and
game. Correlation coefficients were reported, and
whether the judge seemed to understand this statistic
or not, its meaning was explained to him.
Judges were told, for instance, that although space
was the “best” predictor (subjects tended to do as
well on the game as they did on the space test),
reasoning was also a good indicator and should be
considered in predicting Rocket scores. It was suggested
that number was a poor predictor and should
be given little weight in making a decision.

After each set of 10 predictions, the judges were
asked to describe how they had made their predictions.
They were urged throughout to use any system
they thought might maximize accuracy, and
they were assured that there was no “correct” system
that the experimenter expected them to discover.


Results 


Hoffman (1960) and others (Hammond,
1955; Todd, 1954) have “described” the
judge’s method of combining cues with a
multiple regression equation weighting each
variable according to its contribution to the
judge’s predictions. This was done in the present
study with equations fitted to each set
of 10 predictions for all of the 30 judges.
Table 5 summarizes the incidence of regression
coefficients significantly different from
zero for each variable in the three conditions.


TABLE 5

INCIDENCE OF SIGNIFICANT REGRESSION COEFFICIENTS
(p < .05)

Variable
Space
Reasoning
Number
Total

(Cell values are illegible in the source scan.)


Each group of 10 judgments may be considered
one trial for the judge, with each
judge having three trials in all. In 84 trials,
the regression coefficient for the space cue
was significantly different from zero. Num- 
ber score was used to a significant degree 
in 61 trials, but reasoning entered into the 
judges’ predictions to the same degree in only 
40 trials. Both space and number were often 
heavily weighted by the judges in making 
predictions in the first and second series of 
judgments. In the third series, however, when 
the judges had been advised that space and 
reasoning were the best indicators of game 
score, few judges continued to rely on the 
number cue. Equally few put more weight on 
reasoning, however, i.e., most judges used 
only the one best cue on their final series of 
judgments rather than the two best. 
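
The per-judge regression fit described above can be sketched as an ordinary least-squares problem: one judge's 10 predictions are regressed on the three test scores of the corresponding protocols. This is a minimal illustration in plain Python, solving the normal equations by Gauss-Jordan elimination. The function name and the data layout are assumptions, and the sketch returns only the fitted weights; it does not reproduce the significance tests summarized in Table 5.

```python
def fit_judge_weights(protocols, predictions):
    """Least-squares weights for prediction ~ b0 + b1*space + b2*reasoning + b3*number.

    protocols: list of (space, reasoning, number) score tuples.
    predictions: the judge's predicted Rocket score for each protocol.
    Returns [b0, b1, b2, b3].
    """
    if len(protocols) != len(predictions):
        raise ValueError("one prediction is required per protocol")
    rows = [[1.0, *scores] for scores in protocols]  # leading 1.0 is the intercept
    k = len(rows[0])
    # Augmented normal equations: (X'X | X'y).
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(rows, predictions))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot_row = max(range(col, k), key=lambda i: abs(a[i][col]))
        if abs(a[pivot_row][col]) < 1e-12:
            raise ValueError("predictors are collinear; weights are not identifiable")
        a[col], a[pivot_row] = a[pivot_row], a[col]
        pivot = a[col][col]
        a[col] = [v / pivot for v in a[col]]
        for i in range(k):
            if i != col:
                factor = a[i][col]
                a[i] = [vi - factor * vc for vi, vc in zip(a[i], a[col])]
    return [a[i][k] for i in range(k)]


# Hypothetical judge who weights space heavily, reasoning lightly,
# and ignores number entirely.
scores = [(1, 2, 3), (2, 1, 5), (3, 4, 1), (4, 3, 2), (5, 5, 4), (2, 6, 6)]
judged = [2 + 3 * s + 1 * r + 0 * n for s, r, n in scores]
print([round(w, 3) for w in fit_judge_weights(scores, judged)])
```

With only 10 predictions per trial and four fitted parameters, such weights are estimated from few degrees of freedom, which is worth keeping in mind when reading counts of significant coefficients.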

Considering the judges individually, we
might ask how many of them achieved an
accurate weighting of the three variables on
any trial. Table 6 summarizes this information.
If we disregard the absolute size of the
weights and their significance and consider
only their relative size (i.e., S > R > N),
then judges accurately weighted the test
scores in making their final series of predictions.
This would not be expected by chance.

The accuracy of this group of judges in
predicting Rocket scores can be estimated by
comparing the mean predicted scores for each
subject with actually obtained scores. The
product-moment correlation coefficient between
mean predicted scores and actual scores
in Condition III was .82. This was significant
beyond the .001 level (from a significance
test using Fisher’s z transformation), as were
the coefficients of .82 in Condition I and .79
in Condition II. The judges as a group were
able to make accurate predictions of the subjects’
performance in the game, but their
judgments did not become more accurate with
additional practice and experience with the
criterion.
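
The significance test mentioned above can be sketched with Fisher's z transformation. The sketch assumes n = 30 pairs (one mean predicted score and one actual score per subject) and a two-tailed test against a population correlation of zero. Neither the tail nor the exact computation is stated in the text, so both are assumptions, as is the function name.

```python
from math import atanh, sqrt
from statistics import NormalDist


def fisher_z_test(r, n):
    """Test H0: rho = 0 using Fisher's z; returns (z statistic, two-tailed p)."""
    if n <= 3:
        raise ValueError("n must exceed 3")
    # atanh(r) is approximately normal with standard error 1 / sqrt(n - 3).
    z = atanh(r) * sqrt(n - 3)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Reported coefficients for Conditions I, II, and III, with an assumed n of 30.
for r in (0.82, 0.79, 0.82):
    z, p = fisher_z_test(r, 30)
    print(r, round(z, 2), p < 0.001)
```

Under these assumptions all three coefficients fall well beyond the .001 level, in line with the statement in the text.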


Discussion 

Because naive judges were used, it seems
unlikely that preconceived ideas influenced
their judgments to any significant degree.
Their initial responses were made without
training or special instructions as to how to
proceed. We were interested in their approach
to the problem (i.e., their analysis of
the test-game relation) as well as in their
method of synthesizing the information given
them. The judges’ descriptions of the methods
they used indicated that they perceived
the task as one requiring the use of all the
information. The ways in which they tried to
do this were several, and most judges proposed
two or three methods, modifying their
system as they learned more about the criterion.


TABLE 6

NUMBER OF JUDGES ACCURATELY WEIGHTING
VARIABLES IN EACH JUDGMENT SERIES

(Column: number of judges. Cell values are illegible in the source scan; legible footnotes: * p < .12; ** p < .001.)

The data in Table 5 suggest that the judges 
got off to a good start, many of them correctly 
identifying at least one valid cue and actu- 
ally using two on their first trial. When they 
were told what the best cues were, however, 
and urged to use both, they tended to aban- 
don the number cue, as instructed; but in- 
stead of substituting the valid reasoning cue, 
most of them relied on the space cue alone in 
making the final series of predictions. 

The explanation for the latter behavior 
(i.e., switching to one cue for the final series 
of predictions) may be obvious. This task was 
viewed by the judges as a difficult one. It is 
probably reasonable to suppose that whatever 
their method, they were eager to find a simple 
way of arriving at an accurate prediction. Of 
course, knowing that the space score alone 
was the best predictor, ranking by space alone 
was the simplest method, and the experi- 
menter had guaranteed them a fair degree of 
accuracy using the procedure. 

Considering only the judges’ performance 
on Conditions I and II, however, it appears 
that many of them did put at least two cues 
together and come up with a fairly accurate 
prediction. 


CONCLUSIONS AND SUMMARY 


Experimental investigations of clinical judgment
encounter certain methodological difficulties.
The present study describes two quasiclinical
situations in which naive judges were
asked to make judgments on the basis of 
multiple cues. In both of these experimental 
situations, the criterion for accuracy of judg- 
ment could be clearly and categorically de- 
fined. It was also possible to identify the ba- 
sis for the individual judge’s prediction. 

In this context, judges proved able to
handle at least two relevant cues in making
a prediction. Experiment I suggested that
prior training with the individual cues taken
singly enhances the judge’s ability to use both
together when the opportunity is presented.
Subjects in Experiment II showed a tendency
to rely on a single cue when they were assured
of a fair degree of accuracy in so doing.

In the study by Goldberg (1959) described
above, an “expert” was called in and
matched against both clinicians and secretaries
in diagnosing brain damage from the
Bender Gestalt. Where the others had spent
from 45 minutes to 1 hour on the task, this
man took about 20 hours, studying the records
and making his predictions. He was able
to better the record of the clinical trainees by
correctly identifying 83% of the records as
organic or nonorganic. (The clinicians had
averaged 70% correct with the best of them
getting 77% correct.) We might ask whether
this slight advantage was worth the extra
hours. Making judgments on the basis of
multiple cues is hard work, and, in practice,
the judge may find it expedient to sacrifice 
the accuracy to be achieved only by long and 
laborious pondering over test data. Certainly 
judges will be motivated to work so hard only 
if it can be demonstrated that their effort will 
result in some significant increase in accuracy. 

Neither of the studies reported here was 
designed primarily to investigate accuracy of 
judgment. Actually, both groups of judges 
were fairly accurate, but it is not possible to 
show that there was much improvement in 
performance when judges based their predic- 
tions on two or more cues rather than one. 
What these studies do indicate, however, is 
that certain conditions facilitate multiple-cue
prediction. It is suggested that this type of
study can be used to investigate accuracy of 
judgment as a function of amount of informa- 
tion processed by the judge, as well as to ex- 
plore the limits of the human being in this 
respect.


Currently most psychiatric outpatients who
are seen for psychotherapy in public clinics 
are scheduled for treatment once or twice a 
week for about an hour on each occasion 
(Feldman, Lorr, & Russell, 1958; Hollings- 
head & Redlich, 1958). These treatment fre- 
quencies are based primarily on tradition and 
clinical experience. There is very little sys- 
tematic research information concerning the 
efficacy of this or other patterns of interview 
frequency. Actually therapeutic progress could 
have linear or curvilinear relations with the 
number of treatments received over fixed in- 
tervals. Ordinarily, it is lack of therapist time 
and monetary considerations that tend to 
limit the number of contacts per week. However,
one expectation, widely held, is that
therapeutic gain increases in rough proportion
to the number of treatments. This belief
receives some support in reports by Seeman
(1954); Myers and Auld (1955); Imber,
Frank, Nash, Stone, and Gliedman (1957); 
and Feldman et al. (1958). On the other 
hand, Clara Thompson (1950) argues that it 
is duration rather than frequency of contact 
that is the crucial variable. 

In many respects treatment frequency is 
like the dosage level of a drug. Like dosage 
level, frequency of treatment should have im- 
portant relations to treatment effects, to pa- 
tient variables such as severity and nature of 
the disorder, and to treatment variables such 
as therapist characteristics and the method of 
treatment. The present study represents an
effort to explore a few of these relationships.
Its major purpose is to test the hypothesis 
that therapeutic gains resulting from indi- 
vidual psychotherapy increase with the num- 
ber of treatment interviews received over fixed 
time intervals. 


Hypothesized Therapeutic Outcomes 

The dependent variables assessed in the 
study were selected on the basis of a survey 
of the literature as to the kinds of behavior 
purported to change as a function of psycho- 
therapy. The viewpoint taken was that, as a 
result of psychotherapy, patients might be ex- 
pected to report feeling more comfortable; 
report fewer or less disturbing physical or 
psychic complaints; function more effectively 
and comfortably interpersonally; and accept 
themselves more realistically as individuals. 

The changes hypothesized may be particu- 
larized as follows: the greater the number of 
psychotherapeutic interviews within a speci- 
fied period, the greater the: 


1. Reduction in manifest anxiety
2. Reduction in the number and severity of
patient complaints and problems
3. Increase in ego strength
4. Increase in self-acceptance
5. Increase in sociability
6. Reduction in hostility towards others
7. Increase in independent behavior
8. Number of specific positive behavioral
changes reported by the therapist
9. Level of self-awareness or understanding


METHOD 
Study Plan 


The study design called for the random assignment 
of patients at each of seven mental hygiene clinics, 
to one of three different treatment schedules—twice 
weekly, once weekly, and once biweekly. The largest 
participating clinic randomly assigned patients to all 
three treatment frequencies. Two clinics assigned pa- 
tients to once weekly and biweekly schedules. The 
remaining four clinics distributed patients randomly
to twice and once weekly treatment schedules. The
contributing Veterans Administration clinics were located
in Albany, Boston, Bridgeport, Buffalo, Chicago,
Denver, and Hartford.

Because of the high incidence of dropouts it seemed
imperative to re-evaluate the groups at the earliest
point at which treatment effects might be manifested.
Four months was chosen by the participating clinics
as a period within which therapeutic gains could be
expected for the types of patients studied. A second
re-evaluation was scheduled after 8 months of psychotherapy
in order to determine what additional
gains would be demonstrated. Each patient selected
for the study was interviewed and tested just before
initiation of treatment and again at the end of 16
weeks and 32 weeks of psychotherapy. For the initial
examination, the patient was seen successively by the
intake social worker, the intake psychiatrist, and by
the psychologist who administered the test battery,
which ordinarily required about an hour and a half.
He was also rated by his therapist immediately after
his first therapy interview. For the 16-week re-evaluation
each patient was reinterviewed by the
social worker, retested by the psychologist on an
abbreviated test battery, and rerated by his therapist.
The 32-week re-evaluations were identical with
the 16-week re-evaluations except that the social
worker's interview was discontinued for administrative
reasons.


Sample


The sample in each clinic was confined to male 
veterans with service connected psychiatric disabili- 
ties who were less than 51 years of age and without 
any present indication of brain injury. None had re- 
ceived 3 months or more of intensive psychotherapy 
within 90 days of inclusion in the study. All were 
acceptable to the clinics for “intensive” individual 
psychotherapy as subsequently defined.

The typical patient, of the 133 included, was 37
years of age although the range was from 21 to 51.
Approximately half of all patients were high school
or college graduates. Of those employed the average
annual earnings were $3,500. Seventy-five percent of
the group were employed at the time of inclusion in 
the study. The typical patient’s illness was rated as 
“moderate” on a four-point severity scale based on 
composite ratings obtained from psychiatrist, social 
worker, and therapist. 


Treatment and the Therapists 


Most psychotherapists had a psychoanalytic ori- 
entation to treatment although modified Rogerian 
and Sullivanian approaches were also represented.
The term “intensive psychotherapy” was defined in 
terms of interviews of about an hour in duration 
and treatment directed towards reorienting or chang- 
ing the patient, assisting him in modifying his per- 
sonal adjustment patterns, and aiding him in mak- 
ing more constructive use of his assets. Patients 
whose treatment would necessarily be directed at 
keeping them out of a hospital or at maintaining 
them at present levels of adjustment were to be excluded
from the study.

A therapist was defined as any staff member or
trainee whom the clinic regarded as competent to
conduct intensive psychotherapy. Seventy-five different
therapists participated in the study. Of these, 64
were staff members and 11 were residents or trainees.
Twenty-three therapists were psychiatrists, 29 were
clinical psychologists, and the remainder were social
workers. Their experience as therapists averaged 4.5
years.

Patient Measures 


The 10 patient criteria used to evaluate the changes 
hypothesized were the following: 

1. A 50-item modified version of the Manifest Anx- 
iety (MA) scale (Taylor, 1953) was used to test the 
hypothesis concerning manifest anxiety. Reports show 
that the scale correlates well with the clinical concept
of anxiety.

2. The Symptom Checklist was designed to evalu- 
ate the hypothesis concerning the number of patient 
complaints. Included were 20 of the most common 
complaints reported by clinic patients.

3. The 56-item Ego Strength scale (Barron, 1953), 
also taken from the MMPI, was used to test the 
hypothesis concerning ego strength. A few items concerning
religion, potentially offensive, were omitted.
Barron considers the score to be an estimate of
adaptability and personal resourcefulness; a measure
of general capacity for personal integration.

4. A 15-item Sociability measure was taken from the 
Guilford-Zimmerman Temperament Survey (Guilford 
& Zimmerman, 1949). A high score purports to be a 
measure of liking for people and ease in making social
contacts with others.

5. The 15-item Friendliness scale, also taken from 
the Guilford-Zimmerman Temperament Survey, was 
used to assess hostility. A low score is indicative of 
hostility, belligerence, and suspicion of others. A high 
score indicates respect for others and tolerance of 
hostile acts by others. 

6. The Self-Rating scale consists of 16 five-point
linear self-descriptive scales concerned with self-satisfaction.
It was included as a measure of self-acceptance
and self-satisfaction. The lower the score the
greater the dissatisfaction with the self.

7. The Interpersonal Checklist (ICL) (La Forge & 
Suczek, 1955; Leary, 1957) constitutes the source for 
the ICL criteria—assertiveness, cooperativeness, hos- 
tility, and dependence. The Assertiveness score was 
based on 17 ICL adjectives that a group of psy- 
chologists judged “therapeutically desirable.” The 
score is a measure of assertiveness, self-reliance, 
competitiveness, and firmness with others.

8. Another 12 ICL adjectives, also judged to describe
therapeutically desirable behavior, formed a
Cooperativeness-Responsibility scale. A high score de- 
scribes a relatively modest, cooperative, and respon- 
sible individual. 

9. The Autocratic-Hostile scale is composed of
29 adjectives in the ICL judged to describe “therapeutically
undesirable” behavior such as aggressiveness,
exploitiveness, resentment, and suspiciousness.

10. The Dependent-Docile scale consists of 30 adjectives
and phrases taken from the ICL which are
descriptive of such characteristics as passivity, self-effacement,
overdependency, excessive generosity, and
undue deference. All items were judged to be therapeutically
undesirable.

In addition to the criteria of change, a number of
test measures were included as possible predictors of
change. Twenty-five statements from the F Scale
(Adorno, Frenkel-Brunswik, Levinson, & Sanford,
1950) were used as a measure of authoritarianism.
Low scores indicate rigid adherence to conventional
values, condemnation of persons who violate conventional
standards, and generalized hostility. A measure
of interest in thinking, observing oneself and observing
others, called Reflectiveness, was obtained from
the Guilford-Zimmerman scales. It was conjectured
that patients responsive to psychotherapy are more
introspective than those who are not. An adapted 50-item
Behavior Disturbance Inventory was included as
a measure of psychopathic tendencies (Applezweig,
Dibner, & Osbourne, 1958). It purports to identify
individuals who are restless, aggressive, nomadic, and
hostile to authority. A multiple-choice vocabulary
test and a word fluency measure were also included
as likely predictors of therapeutic response.

Coefficients of internal consistency were computed
for each of the 15 patient and therapist criteria.
Their median value was .80 and the range was from
.68 to .89. To secure indices of test stability, pretreatment
scores were correlated with 4-month and
8-month scores. The median correlation between the
pretreatment and 8-month scores was .78. These
coefficients imply that the criteria are fairly homogeneous
and surprisingly stable.


Therapist Measures


Each therapist described his patient on a number 
of measures which were used to assess change from 
the therapist’s viewpoint.

1. A Severity of Illness score was derived from
ratings on seven four-point scales assessing characteristics
related to degrees of severity of illness (Lorr,
Holsopple, & Turk, 1956). Ratings were obtained
immediately after the first interview and again at the
time of each re-evaluation.

2. Each therapist was asked to describe his patient
on the ICL. The degree of agreement between the
therapist’s description of the patient and the patient’s
self-description was then summarized by means of a
concomitance index called the Patient-Therapist ICL
Index (P-T ICL). It was conjectured that the extent
of agreement would increase with progress in
therapy and reflect the patient’s understanding of
himself.

3. The Change Inventory was completed by the
therapist at the time of each re-evaluation (McNair
& Lorr, 1960a, 1960b). The Inventory consisted of
92 statements marked true or not true, which are
descriptive of specific changes frequently observed in
psychotherapy patients. An Interview Relationship
(IR) score is based on 24 statements descriptive of
the patient’s participation and resistance during the
interview. A second score of Interpersonal Changes
(IC) is based on 42 specific changes in interpersonal
relations judged to be therapeutically desirable. The
third measure of Symptom Reduction (SR) is based
on 26 observed reductions in the frequency or severity
of symptoms and problems exhibited.

Therapists also rated the patient on a number of 
variables that held promise as predictors of response 
or which were needed as control variables. Each 
therapist rated his patient on four-point scales of 
Liking for the Patient and Interest in the Patient's 
Problem. He also rated the patient’s degree of Motivation
for Treatment. Finally, the therapist indicated
the appropriateness of the randomly assigned
treatment frequency for the patient’s condition on a
Suitability of Treatment Frequency scale.


Social Worker Measures 


The social worker interviewed the patient at the 
time of intake and again at the end of 16 weeks of 
treatment. Seven scales were concerned with Employ- 
ment Adjustment, four scales described Social and 
Community Adjustment while three scales dealt with 
Family Life Adjustment. Only social adjustment and 
severity of illness ratings were used in the analysis.


Method of Analysis 


The statistical model for evaluating the effect of 
increased amounts of psychotherapy on the various 
criteria was analysis of covariance within a simple 
randomized design. Final criterion mean scores were 
adjusted for initial status on the criterion being analyzed
as well as for the effects of other control variables
as seemed necessary. In each instance the correlations
between initial criterion scores, control variables,
and final criterion scores were examined. As
required the final criterion scores were adjusted by 
covariance both for initial criterion scores as well as 
for the background variables. The effect of adjust- 
ment was to provide statistical equality of the treat- 
ment groups prior to treatment and to reduce the 
size of the error term in the F test in evaluating 
final adjusted means. The patient control variables 
considered were: age, highest grade completed, an- 
nual earnings, employment status, vocabulary score, 
F Scale (authoritarianism) score, Behavior Disturb- 
ance scale score, and Word Fluency test score. Ob- 
server control variables available were: length of 
therapist experience, rated competence of therapist, 
rated liking and interest of therapist for patient, and 
rated suitability of treatment frequency.
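
As a rough illustration of the covariance adjustment described above, the following Python sketch computes adjusted group means and the F ratio for a one-way analysis of covariance with a single covariate (the initial criterion score). The function name `ancova_one_covariate` and the simulated scores are hypothetical and are not taken from the study; the study also adjusted for further control variables where necessary, which this sketch does not attempt.

```python
import numpy as np


def ancova_one_covariate(groups):
    """One-way ANCOVA with a single covariate.

    groups maps a group label to (initial_scores, final_scores).
    Returns (adjusted_means, F, (df_between, df_error)).
    """
    data = {g: (np.asarray(x, float), np.asarray(y, float))
            for g, (x, y) in groups.items()}
    all_x = np.concatenate([x for x, _ in data.values()])
    all_y = np.concatenate([y for _, y in data.values()])
    n, k = len(all_x), len(data)

    # Pooled within-group sums of squares and cross-products.
    w_xx = sum(((x - x.mean()) ** 2).sum() for x, _ in data.values())
    w_xy = sum(((x - x.mean()) * (y - y.mean())).sum() for x, y in data.values())
    w_yy = sum(((y - y.mean()) ** 2).sum() for _, y in data.values())

    # Total sums of squares and cross-products, ignoring group membership.
    t_xx = ((all_x - all_x.mean()) ** 2).sum()
    t_xy = ((all_x - all_x.mean()) * (all_y - all_y.mean())).sum()
    t_yy = ((all_y - all_y.mean()) ** 2).sum()

    # Final means are adjusted to the grand mean of the initial scores
    # using the pooled within-group regression slope.
    slope = w_xy / w_xx
    adjusted = {g: y.mean() - slope * (x.mean() - all_x.mean())
                for g, (x, y) in data.items()}

    ss_error = w_yy - w_xy ** 2 / w_xx
    ss_between = (t_yy - t_xy ** 2 / t_xx) - ss_error
    df_between, df_error = k - 1, n - k - 1  # one df is spent on the covariate
    f_ratio = (ss_between / df_between) / (ss_error / df_error)
    return adjusted, f_ratio, (df_between, df_error)


# Simulated scores only (hypothetical); group sizes follow the 4-month sample.
rng = np.random.default_rng(0)
sizes = {"twice weekly": 45, "once weekly": 62, "biweekly": 26}
example = {}
for label, size in sizes.items():
    initial = rng.normal(30, 5, size)
    final = 0.8 * initial + 4 + rng.normal(0, 3, size)
    example[label] = (initial, final)

adjusted_means, f_ratio, df = ancova_one_covariate(example)
print(adjusted_means, round(f_ratio, 2), df)
```

With three groups and 133 patients the error term has 133 - 3 - 1 = 129 degrees of freedom, which agrees with the df = 2/129 quoted for the 4-month F tests.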


RESULTS 
The results for the 4- and 8-month assessment
periods are presented sequentially. For each period
the differential effects of treatment frequency are
first described and then the treatment gains.
Whenever significant differences between treatment
frequency groups could not be demonstrated, tests
were made to identify any changes from pretreatment
status. In order to test for total treatment
gains, all treatment groups were pooled and
tests were made on the combined group scores.


TABLE 1

ADJUSTED TREATMENT GROUP MEANS AFTER FOUR
MONTHS OF PSYCHOTHERAPY
(Initial scores controlled)

Columns: Criterion; Treatment frequency (Twice weekly, N = 45;
Once weekly, N = 62; Biweekly, N = 26); F
Patient measures: Manifest Anxiety, Ego Strength, Symptom
Checklist, Self-Rating, Sociability, Friendliness
Therapist measures: Severity of Illness, IR, IC, SR
Other measures: SW Severity of Illness, Social Adjustment, P-T ICL

a F.05 (df = 2/129) = 3.06.
b 4-month means are presented and tested as there were no
initial scores on this variable.


Treatment Frequency: Four Months 


During the first 4 months of therapy, the 
twice weekly frequency group actually at- 
tended an average of 25.5 (s = 3.8) therapy 
sessions; the once weekly group, an average 
of 14.5 (s = 2.3) sessions; and the biweekly 
group, an average of 8.6 (s = 1.6) interviews. 
The three treatment groups did not differ sig- 
nificantly when treatment began on any of 
the criteria or on predictor or background 
variables. Table 1 presents the results of tests 
for treatment effects on 13 criteria of response 
to psychotherapy. The influence of initial 
scores on the 4-month scores is controlled by 
the method of analysis of covariance except 
in the case of the Therapist Change Inven- 
tories. Since there were no initial scores on 
the three Change Inventories, differences were 
tested by one-way analysis of variance. 


None of the F tests on the criteria is sig- 
nificant at the .05 level. In addition to the 
criteria listed in Table 1, the four ICL scores 
also showed no differences at 4 months. With 
the exception of the Therapist Change Inven- 
tories, there is no evidence of a trend for the 
biweekly group to show less change than the 
two groups receiving more frequent interviews. 
The F test for SR Changes approaches sig- 
nificance (.10 > p > .05), and this result af- 
fords the only suggestion of a difference in 
response to psychotherapy as a function of 
interview frequency. Three sources of infor- 
mation—patient, therapist, and an independ- 
ent social worker—apparently concur in ob- 
serving no differences ascribable to treatment 
frequency. Thus the major hypothesis receives 
no support for the patients studied over a 4- 
month interval. 

Was there a differential dropout rate by 
treatment frequency group prior to the 4- 
month evaluation? If so, and if patients with 
different characteristics dropped out of the 
different frequency groups, a serious bias 
could have been introduced into the compari- 
sons. A chi square test indicated that there is 
no significant association between treatment 
frequency and dropping out or remaining. 
Additional analyses (not presented here) indicate
that the terminators on each treatment
schedule also do not differ significantly in
initial status on the criterion or predictor
measures or on background variables such as
age and education.


TABLE 2

CHANGES OVER FOUR-MONTH PERIOD FOR
COMBINED GROUPS

Patient measures: Manifest Anxiety, Ego Strength, Symptom
Checklist, Self-Rating, Sociability, Friendliness
Therapist measure: Severity of Illness
Other measures: SW Severity of Illness, Social Adjustment, P-T ICL

a t.05 (one-tailed) = 1.64.
* p < .05.
** p < .01.


Treatment Gains: Four Months 


Table 2 presents initial and 4-month means 
and t tests on the 10 criteria which were administered
on both occasions. There are no
significant changes on the patient measures.
However, both therapists and social workers 
independently observe a significant decrease 
in Severity of Illness of the study sample. 
The Patient-Therapist ICL Concomitance In- 
dex also increases significantly, i.e., patient 
and therapist are more similar in describing 
the patient on an adjective checklist after 4 
months of therapy than after one interview. 
The ICL change appears due to therapists 
simply checking more adjectives in describ- 
ing a patient after knowing him 4 months 
rather than to any shift in the content of 
the patient’s self-descriptions. Thus, after 4 
months of psychotherapy, the patients in the
study report no significant changes on the
patient criteria. However, therapists and social
workers observe some favorable changes
not associated with frequency of interviews.
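
The pre-to-post comparisons reported for the combined groups are one-tailed t tests on change from initial status. A minimal sketch of such a test for paired scores is given below; the function name `paired_t` and the example scores are hypothetical, and the critical value 1.64 is the large-sample one-tailed .05 value quoted with the tables.

```python
import math


def paired_t(initial, final):
    """Return (t, df) for the mean of the paired differences final - initial."""
    diffs = [f - i for i, f in zip(initial, final)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)  # sample variance
    return mean_d / math.sqrt(var_d / n), n - 1


# Hypothetical scores for four patients, for illustration only.
initial_scores = [10, 12, 14, 16]
final_scores = [11, 14, 15, 18]
t_value, df = paired_t(initial_scores, final_scores)

# One-tailed test in the predicted direction (here, an increase).
T_CRITICAL_05 = 1.64
print(round(t_value, 2), df, t_value > T_CRITICAL_05)
```

For a criterion on which a decrease is predicted, such as Manifest Anxiety, the sign of the difference would be reversed before the comparison with the critical value.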


Treatment Frequency: Eight Months 


The 8-month analysis is based on 58 patients
who remained in the study on the original
randomly assigned treatment frequency
schedules. There were 16 patients in the 
twice weekly group with a mean of 50.8 
(s = 6.4) therapy sessions. Thirty patients in 
the once weekly group had a mean of 27.7 
(s = 4.1) sessions, and 12 patients in the bi- 
weekly group had a mean of 14.4 (s = 2.8) 
hours of treatment. Before starting treatment 
the groups did not differ significantly on cri- 
terion, predictor, or background variables. 
However, before considering the results of 
treatment with these 58 cases, two questions 
about the 8-month sample should be an- 
swered. Is loss of patients from the study 
systematic by treatment frequency? Is the 8- 
month study sample comparable to the group 
analyzed at the 4-month evaluation? 

A chi square test of the relation between 
treatment frequency and remaining or dropping
out of the study prior to the 8-month
evaluation was not significant although a
slightly higher proportion of the twice weekly 
group did drop out during this time interval. 
The 75 cases lost from the study included 57 
who had completed or stopped treatment and 
18 others who remained in psychotherapy but 
were dropped from the study because their 
treatment frequency was altered for one con- 
tingency or another (e.g., employment inter- 
fered, therapist judged it imperative to change 
the frequency of interviews). 
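
The chi square test of dropout against treatment frequency can be sketched as follows. The counts of patients remaining at 8 months (16, 30, 12) are given in the text; the starting group sizes (45, 62, 26) are taken here to be the N row of Table 1, and the per-group dropout counts are obtained by subtraction. That reading is an assumption of this sketch, although the derived total of 75 dropouts agrees with the text. The function name `chi_square_independence` is hypothetical.

```python
def chi_square_independence(table):
    """Pearson chi square for a two-way table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Order of groups: twice weekly, once weekly, biweekly.
started = [45, 62, 26]    # assumed from the N row of Table 1
remained = [16, 30, 12]   # 8-month sample sizes stated in the text
dropped = [s - r for s, r in zip(started, remained)]

chi2, df = chi_square_independence([remained, dropped])
CRITICAL_05_DF2 = 5.991   # chi square critical value, df = 2, alpha = .05
print(round(chi2, 2), df, chi2 > CRITICAL_05_DF2)
```

Under these assumed counts the statistic is about 1.8 on 2 degrees of freedom, well below the critical value, which is in line with the nonsignificant result reported.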

The pretreatment status of patients re- 
maining in the study at 8 months at the as- 
signed treatment schedules differed signifi- 
cantly in certain respects from the patients 
who dropped out. The differences suggest 
principally that the 8-month sample was more 
disturbed when they began therapy; MA
scores were higher and self-descriptions were 
less favorable. The 8-month sample was also 
an average of 2 years older, but this differ- 
ence was not quite significant. Therapists ob- 
served significantly less IC + SR Change in 
the 8-month sample during the first 4 months 
of therapy. However, an analysis of patient 
self-report measures indicated that the re- 
mainers and dropouts did not differ signifi- 
cantly in amount of change over 4 months. 
Thus patient reports did not corroborate those 
of the therapists. 

Table 3 presents the results of a compari- 
son of treatment frequency groups, after 8 
months of treatment, on 15 criteria of re- 
sponse to psychotherapy. The IC and SR 
changes correlated so highly that they were 
combined into a single measure. None of the 
differences between frequency groups are sig- 
nificant at the .05 level, nor is there much 
suggestion of any observable trend in the hy- 
pothesized direction. Patient and therapist 
measures corroborate each other in indicating 
no differences between groups over the 8- 
month treatment period. Only the difference 
between groups on the patient-therapist ICL 
approaches significance. The difference sug- 
gests that the more frequently patients and 
therapists see each other, the greater their 
agreement in describing the patient. Thus 
once more the research hypothesis that treat- 
ment effects will be greater with more fre- 
quent treatments is not supported after 8 
months of psychotherapy. 
TABLE 3

ADJUSTED TREATMENT GROUP MEANS AFTER EIGHT
MONTHS OF PSYCHOTHERAPY
(Initial scores controlled)

Columns: Criterion; Treatment frequency (Twice weekly, Once
weekly, Biweekly); F
Patient measures: Manifest Anxiety, Ego Strength, Symptom
Checklist, Self-Rating, Sociability, Friendliness, ICL: Assertiveness,
ICL: Cooperativeness, ICL: Hostility, ICL: Dependence
Therapist measures: Severity of Illness, IR, IC + SR, P-T ICL

a F.05 (df = 2/54) = 3.17.
b Four-month scores controlled as there were no initial scores
on this variable.


Treatment Gains: Eight Months 


Table 4 presents initial and 8-month means 
of the combined groups on the various criteria
as well as the results of t tests (one-tailed).
Of the patient measures, the Ego
Strength scale scores increase significantly in 
the predicted direction. Patients also use sig- 
nificantly fewer Dependency adjectives in de- 
scribing themselves at the end of 8 months of 
therapy. There are no significant shifts from 
pretreatment status on the other patient cri- 
teria. As was true at 4 months, the therapists 
observe a significant decrease in Severity of 
Illness as compared with the beginning of 
therapy. In addition, therapists note signifi- 
cantly more IC + SR Changes at 8 months 
than they had observed at 4 months. Thus 
both therapists and patients provide evidence 
of improvement over initial or 4-month status 
even though this improvement does not re- 
late significantly to frequency of treatment. 


A ONE-YEAR FOLLOW-UP


A high proportion of the original group was 
known to be in treatment at the end of 12 
months, although not all on the originally assigned
treatment schedule. Yet it seemed useful
to retest those remaining in order to assess
any changes. Thus 1 year after starting
psychotherapy, the 133 patients who had com- 
pleted at least 4 months of psychotherapy 
were retested or asked to return for a follow- 
up re-evaluation. Of 102 patients retested 
(77% of the total) 55 had remained in psy- 
chotherapy for the entire year (Ingroup), 
and 47 had completed or terminated psycho- 
therapy between 4 months and 1 year (Out- 
group). The 31 patients who did not respond 
to the follow-up request did not differ signifi- 
cantly in background characteristics from the 
re-evaluated group. 

To test for effects of treatment frequency 
at the one year follow-up, the analysis re- 
quired a grouping of patients which differed 
from the original experimental design. By one
year, too few patients remained in therapy 
on the original randomly assigned treatment 
schedules for meaningful analysis by treat- 
ment frequency group. It also seemed essen- 
tial to analyze the In- and Outgroups sepa- 
rately because of different duration of treat- 
ment. Therefore, both the Ins and Outs were 
dichotomized into two groups on the basis of 
the number of interviews received. Patients
receiving less than the median number of interviews
were classed as Lows; all others were
classed as Highs. Comparisons of High and
Low interview groups were then made separately
for both Ins and Outs. The actual effect
of dichotomizing was that all cases assigned
to twice weekly treatment fell into the
High interview subgroups, all cases originally
assigned to biweekly treatment fell into the
Low interview subgroups, but the once weekly
group patients were distributed to both groups
depending on the number of appointments
they actually kept.


TABLE 4

CHANGES OVER EIGHT-MONTH PERIOD FOR
COMBINED GROUPS

Patient measures: Manifest Anxiety, Ego Strength, Symptom
Checklist, Self-Rating, Sociability, Friendliness, ICL: Assertiveness,
ICL: Cooperativeness, ICL: Hostility, ICL: Dependence
Therapist measures: Severity of Illness, IR, IC + SR

a t.05 (one-tailed) = 1.64.
b Four-month means.

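
The dichotomizing rule for the one-year analysis (below the median number of interviews is Low, all others are High) can be sketched as follows. The function name `median_split` and the patient labels and interview counts are hypothetical illustrations, not study data.

```python
from statistics import median


def median_split(interviews):
    """Split patients into Low (< median interviews) and High (>= median)."""
    cut = median(interviews.values())
    lows = sorted(p for p, n in interviews.items() if n < cut)
    highs = sorted(p for p, n in interviews.items() if n >= cut)
    return cut, lows, highs


# Hypothetical interview counts for six patients still in therapy at one year.
ingroup = {"A": 17, "B": 29, "C": 42, "D": 43, "E": 62, "F": 96}
cut, lows, highs = median_split(ingroup)
print(cut, lows, highs)
```

In the study the split was made separately for the Ins and the Outs, so the function would be applied once to each set of patients.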
Ingroup Comparison 

The High group consisted of 28 patients 
seen an average of 62 interviews, with a range 
of 43-96 sessions. The Low group consisted 
of 27 patients seen an average of 29 inter- 
views, with a range from 17-42 interviews. 
The High and Low groups differed significantly
on a few variables before starting
treatment. The High interview group was
younger, included more unemployed patients,
more single men, and reported fewer symptoms
and complaints at the start of therapy.
Therapists also expressed significantly greater 
interest in the types of problems presented 
by the High group. Thus it appears that the 
12-month comparison groups are not as com- 
parable on initial status as the 4- and 8- 
month comparison groups which differed very 
little. This could well be due to the fact that 
the High and Low groups were formed partly 
on the basis of the number of appointments 
the patients actually kept during the year of 
treatment.

Table 5 presents the results of tests for 
differences between the High and Low groups 
at one year. As previously, initial scores are 
controlled by analysis of covariance. Two new 
patient criteria, Social Changes and Psychological
Changes, are also listed in Table 5.
The two measures contain a total of 19 items 
taken from the therapist IC + SR measure 
and adapted for patient self-reports. As the 
names imply, one contains a set of changes in 
the social-interpersonal adjustment area, and 
the other contains changes in self-attitudes 
and psychological symptoms. None of the 
therapist measures even suggests a difference 
between Highs and Lows at one year. Only
two patient measures indicate a significant
effect of number of interviews. The High patients
scored significantly lower on the Friendliness
scale and used significantly fewer Cooperativeness-Responsibility
adjectives in describing
themselves.


TABLE 5

ADJUSTED TREATMENT GROUP MEANS FOR PATIENTS
IN THERAPY AT TWELVE MONTHS
(Initial scores controlled)

Columns: Criterion; Interview frequency (High, Low); F
Patient measures: Manifest Anxiety, Ego Strength, Symptom
Checklist, Self-Rating, Sociability, Friendliness, ICL: Assertiveness,
ICL: Cooperativeness, ICL: Hostility, ICL: Dependence,
Social Changes, Psychological Changes
Therapist measures: Severity of Illness, IC + SR, P-T ICL

a F.05 (df = 1/52) = 4.04.
b Twelve-month means presented and tested, as covariance
analysis assumptions untenable.
c Twelve-month means presented and tested as there were
no initial scores on this variable.
d Median changes listed for each group. χ² for median test
is presented as variances were heterogeneous.
e Four-month scores controlled as there were no initial
scores on this variable.
* p < .05.
** p < .01.


The labels Friendliness and Cooperativeness
probably do not convey very well the
nature of the changes that took place. An 
item analysis revealed that the High group 
showed reduced Friendliness because they 
more often rejected such platitudes as “It 
pays to turn the other cheek rather than to 
fight,” and more often endorsed such state- 
ments as “It is often necessary to fight for 
what is right,” and “If I resent someone’s ac- 
tions I promptly tell him so.” The reduced 
Cooperativeness scores were due to less fre- 
quent endorsement of such self-descriptions 
as “Accepts advice readily,” “kind and re- 
assuring,” and “often helped by others.” Item 
analysis, thus, suggests that the High group 
did not become unfriendly and uncooperative
as the test labels would suggest, but instead 
became more outspoken, assertive, independ- 
ent, and determined to protect their own in- 
terests. 


Outgroup Comparison 


The High interview group consisted of 23 
patients no longer receiving treatment, seen 
an average of 34 interviews with a range of 
23-63 sessions. The Low group consisted of 
24 patients seen an average of 16 sessions, 
with a range from 10-22 interviews. There 
was no difference in duration of treatment for 
the High and Low groups; mean length of 
therapy was 29 weeks for both groups. Tests 
to determine if the High and Low groups 
were comparable samples of patients when 
treatment began revealed only one significant 
difference on background characteristics and 
initial criterion scores. Therapists judged the 
High group to be significantly more severely 
ill than the Low group. (Means: 22.5 vs. 18.8, 
t= 2.71, p< .01.) Analysis of the seven 
items comprising the set of severity scales in- 
dicated therapists judged the High group to 
be more distressed, more suspicious of others, 
and more self-preoccupied. They rated the 
High group as moderately ill and the Low 
group as mildly-to-moderately ill. 


TABLE 6 


ADJUSTED TREATMENT GROUP MEANS FOR PATIENTS
OUT OF THERAPY AT TWELVE MONTHS
(Initial scores controlled)

a F.05 (df = 1/44) = 4.07.
b Actual twelve-month means are presented and tested as
there were no initial scores on this variable.
TABLE 7

INITIAL AND TWELVE-MONTH TEST MEANS FOR
PATIENTS IN THERAPY AT TWELVE MONTHS

                                 Means
Criterion                  Initial   12 month    t (a)

Patient measure:
  Manifest Anxiety           29.9      26.4      6.61b**
  Ego Strength               31.3      33.8      4.26**
  Symptom Checklist           8.0       7.0      1.94*
  Self-Rating                45.7      47.1      1.29
  Sociability                           6.9      0.11
  ICL: Assertiveness                    9.9      0.75
  ICL: Hostility                       11.4      0.68
  ICL: Dependence                      14.3      1.45

Therapist measure:
  Severity of Illness                  18.9      4.86**
  IR                                    8.9      0.52
  IC + SR                              16.4      3.20b*
  P-T ICL                              0.36      0.35

a t.05 (one-tailed) = 1.64.
b Table entry is χ² for McNemar (1955) test of changes;
t test invalid because of heterogeneous variances. χ².05 (one-tailed) = 2.71.
c Four-month means.
* p < .05.
** p < .01.


Table 6 presents the results of tests for dif- 
ferences between the High and Low Out- 
groups at one year. No therapist measures 
were available as the Outs were no longer re- 
ceiving therapy. Three of the differences on 
the patient measures are significant at p < .05. 
Two of the differences—Friendliness and Co- 
operativeness—involve the same variables in 
the same direction as found in the Ingroup 
analysis. In addition the High group patients 
showed a significantly greater reduction in 
number of Dependency Adjectives used in 
self-descriptions. The Low patients actually 
showed a slight nonsignificant increase in the 
number of such adjectives. Examples of self-
descriptions used less frequently by the High
interview group are “obeys too willingly,” “shy,”
“wants everyone’s love,” and “likes everybody.”
All three significant differences tend to con- 
firm the findings with the Ingroup, that pa- 
tients with more interviews see themselves as 
more assertive and outspoken in pursuing 
their own interests, less dependent, and less 
willing to be imposed upon by others. In ad- 
dition to the above differences, the greater 
reduction in MA approaches significance (p 
< .10) and lends support to the impression 
of greater improvement for the High inter-
view patients. However, no simple picture of
improvement can be claimed for the High pa- 
tients. Both groups report about the same 
number of symptoms and complaints (Symp- 
tom Checklist) at 12 months, and the Low 
group tends to report slightly more Psycho- 
logical Changes (p < .10) since they started 
therapy. 


Ingroup Treatment Gains 


Table 7 presents tests of differences be- 
tween initial and 12-month means on those 
patient and therapist measures which were 
unaffected by interview frequency. Significant 
changes from pretreatment status occurred on 
three of the patient measures. MA scores and 
the number of reported symptoms decreased, 
while the Ego Strength scale score increased. 
Except for the unchanged Sociability scale, 
the remaining patient measures show slight 
movements in the predicted direction. Thera- 
pists corroborate the pattern of improvement 
by reporting a decrease in severity of illness 
that is significant. They also report signifi- 
cantly more interpersonal changes and symp- 
tom reductions compared with changes dur- 
ing the first 4 months of psychotherapy. No 
additional change in the interview relation- 
ship is noted after the first 4 months which 
suggests that this measure may be sensitive 
only to early changes. In summary, the pa- 
tients in treatment at the close of 1 year show 
a pattern of favorable change from initial 
status that is somewhat broader than the 
pattern at 8 months and considerably broader 
and more consistent than at 4 months. 
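
The initial-versus-12-month comparisons summarized in Table 7 can be illustrated with a short sketch. Everything in it is an assumption made for illustration: the scores and counts are hypothetical and are not data from the study, the names `paired_t` and `mcnemar_chi2` are invented here, and the uncorrected form of the McNemar statistic is assumed because the text does not state which form was computed.

```python
import math


def paired_t(initial, final):
    """Return (t, df) for the mean of the paired differences final - initial."""
    if len(initial) != len(final) or len(initial) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [f - i for i, f in zip(initial, final)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # Unbiased variance of the differences (n - 1 in the denominator).
    var = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    if var == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean_diff / math.sqrt(var / n), n - 1


def mcnemar_chi2(improved, worsened):
    """Chi-square (df = 1) for a test of changes, without continuity correction."""
    total = improved + worsened
    if total == 0:
        raise ValueError("no patients changed category")
    return (improved - worsened) ** 2 / total


# Hypothetical anxiety scores for five patients (not data from the study).
initial_scores = [30, 28, 32, 29, 31]
final_scores = [27, 26, 28, 27, 28]
t, df = paired_t(initial_scores, final_scores)
print(f"t = {t:.2f}, df = {df}")

# Hypothetical counts of patients who crossed a cutoff in each direction.
chi2 = mcnemar_chi2(improved=15, worsened=5)
print(f"chi-square = {chi2:.2f}")
```

A negative t here indicates a decrease from the initial score. For the chi-square, the notes to Table 7 give 2.71 as the one-tailed .05 criterion.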


Outgroup Treatment Gains 


Tests were made to determine whether 
there were significant shifts from pretreat- 
ment status for the entire Outgroup on the 
variables which did not show a significant 
relation to treatment frequency. Comparisons 
of initial and 12-month means were made on 
the following variables: Ego Strength, Symp- 
tom Checklist, Self-Rating, Sociability, As- 
sertiveness, and Hostility. None of the differ- 
ences was significant. Thus, among the Out- 
group, only those patients with a high number 
of interviews present any evidence of change 
from pretreatment status. 


DISCUSSION


Related Studies


Imber et al. (1957) have reported a study 
similar to the present investigation in objec- 
tive. They hypothesized that patients having 
fewer and briefer sessions of psychotherapy 
would show significantly less improvement 
than patients with more and longer sessions 
over the same period of time. Patients were 
assigned at random to group psychotherapy 
for 1.5 hours once a week, to individual psy- 
chotherapy for 1 hour once a week, and to 
minimal contact therapy for .5 hour once 
every two weeks. After 6 months they report 
more improvement in group and individual 
psychotherapy patients than in minimal con-
tact patients. Unfortunately, length of inter-
view, frequency of treatment, and type of
treatment are confounded in the study de- 
sign. It is not possible to infer which variable 
was the effective agent. 

Ends and Page (1959) claim to have dem- 
onstrated that doubling the number of group 
therapy sessions with a constant time interval 
increases therapeutic movement. Lack of sta- 
tistical or experimental control of sampling 
sources and other defects cast serious doubt 
on their findings. Cartwright (1955) and 
Taylor (1956) have independently found that 
ratings of success or improvement on closed
cases increase with number of treatments re-
ceived. Duration of treatment seems not to
have been controlled in either study. Further,
these studies are concerned with closed cases
rather than open cases as in the present re- 
port. 


Contingent Variables 


Ordinarily therapist and patient mutually 
decide how frequently treatment sessions 
should be scheduled. Yet random assignment
of patients to schedules without regard to
judged suitability may have affected outcome.
The hypothesis that patients on schedules 
rated suitable by the therapist after the initial 
interview would improve more than patients 
on schedules rated unsuitable was tested on 
10 study criteria obtained at the 4 month re- 
evaluation (McNair & Lorr, 1960a). The evi-
dence did not support the hypothesis for any
of the criteria. Suitability ratings were, how-
ever, associated with therapists’ personal re-
actions to patients.

The factor of chronicity of illness of the 
group studied calls for some consideration. 
The patients studied were all male veterans 
who were treated for service associated psy- 
chiatric problems that date, presumably, from 
World War II or the Korean War. Illness 
thus dates back 8-15 years and may be re-
garded as chronic. Further, about half of the
study group had been in psychotherapy once 
before, for a median period of 10 months. Yet 
a survey by Feldman et al. (1958) showed 
that the typical Veterans Administration open 
case is seen for 1.5 years. Hollingshead and 
Redlich (1958) found the median number of 
years neurotic patients had been seen in treat- 
ment was about a year for Social Class III 
and Class IV persons. It is thus not surpris-
ing that substantial changes did not appear
until after 8 months of treatment; nor, in view
of the chronicity of illness, is this result
unimpressive.


Patient Treatment Gains 


In the absence of a control group of pa- 
tients who receive no treatment, what can be 
inferred concerning the changes over time re- 
ported by patients and therapists? There is 
scant direct experimental justification for con- 
sidering the changes obtained as effects of 
psychotherapy. However, there is much in- 
direct support for the hypothesis that the 
changes obtained are due to individual psy- 
chotherapy. Changes obtained are in predicted 
directions and are both plausible and consist- 
ent. Furthermore, the pattern of change be- 
comes broader at each assessment period. At 
the end of 4 months few significant changes 
were obtained. By the close of 8 months there 
were significant increases in Ego Strength 
scores and in ICL Dependency scores. Thera- 
pists observed a reduction in severity of ill- 
ness and a reduction in symptoms. After one 
year in treatment patients exhibited signifi- 
cant gain in Ego Strength and reductions in 
Manifest Anxiety and Symptom distress. On 
this occasion the therapists again noted a re- 
duction in severity of illness. However, most 
of the IC and SR gains were in the area of 
Interpersonal Relations. Little further reduc- 
tion in the number and severity of symptoms 
was reported. The order of these changes is
what one might expect. Symptoms are first 
reduced and the more complex gains in in- 
terpersonal relations come later. 

It would be difficult to explain these sys- 
tematic gains in terms of a hypothesis of 
spontaneous improvement. For example, the 
Low Outgroup failed to change although in 
treatment as long as the High Outgroup. If 
patients improved spontaneously the Low Out- 
group could also be expected to gain unless 
the number of treatments was an agent. An-
other likely argument is that patients and
therapists have a need to justify the expendi- 
ture of time and effort in treatment and re- 
spond by reporting favorable changes. This 
argument is persuasive but it fails to account 
for the relative absence of changes after 4 
months of treatment and the limited changes 
obtained after 8 months. Why do patients 
and therapists restrict the changes to certain 
areas? A tentative inference can thus be
made, from the internal evidence, that the
gains reported are ascribable to psychother-
apy in a mental hygiene clinic setting.

Apparently the nature of the measures also
played a role in controlling patient and thera-
pist bias. Therapists were asked to report on
specific attitudinal and behavioral changes
observed. If they had been asked to indicate
the degree of global improvement observed
the findings would have been far more im-
pressive and probably much more biased.
Likewise, patient gains were determined from
status scores. If given the opportunity to re-
port how much they “improved” the changes
would have been much larger and also more
easily distorted.


Treatment Frequency and Duration 


What may be concluded concerning the ef- 
fects of increasing the number of interviews 
per month? There is little evidence that treat-
ment frequency affects treatment outcome for
the range of schedules used, for the kinds of
patients studied, and for the methods of treat- 
ment applied. The results might have been 
different if, for example, the method of treat- 
ment was psychoanalysis and the treatment 
schedule was three to five times a week. How- 
ever, this is a conjecture that has no experi- 
mental data for support. 





The major finding is that for the condi- 
tions of this study duration of therapy, rather 
than treatment frequency, is associated with 
predicted change over an 8-month period. For 
the one-year period the basis of treatment 
effects is uncertain because the analysis is no 
longer based on randomly assigned experi- 
mentally controlled groups. The composition 
of the High and Low interview groups at 12 
months is determined, in part, by the number 
of appointments patients kept. On the other 
hand the agreement between the High In- 
group and High Outgroup in the kinds of 
changes displayed suggests that the number 
of treatments may have some influence on 
outcome. Of the two variables, duration of 
psychotherapy seems more influential than 
treatment frequency. The pattern of change 
becomes broader and more consistent at each 
assessment period. However, it should be 
noted that for the 8-month and 1-year pe-
riods data were not available from sources
outside of therapy such as independent inter- 
viewer, spouse, friends, or relatives. Conclu- 
sions reached here, therefore, should be quali- 
fied by this limitation on sources of evidence. 


Only the data for patients out of therapy 
before the end of a year support Cartwright’s 
(1955) finding that the number of treatments 
has a nonlinear relation to therapeutic gain
as reported by therapists. The data on pa-
tients out of treatment suggest that a mini-
mum number of treatments is needed to ef-
fect a determinable change. The Low Out-
group did not change over the one-year period
while the High Outgroup did. Mere passage 
of time is thus not sufficient to account for 
this difference. 

The evidence suggests that change requires 
time. Perhaps trial and error testing is a pre- 
requisite for the process of growth and change. 
New ways of reacting interpersonally must 
be tested repeatedly in natural settings be- 
fore what has been learned becomes consoli- 
dated. Insights must be put into practice. The 
findings of this study suggest that traditional 
treatment frequency schedules be examined. 
If duration is the more influential variable 
for ordinary psychoanalytic type psychother-
apy, therapist time can be spread over more
patients with fewer contacts and at less cost 
to patients. 


SUMMARY 


1. The hypothesis that patient changes in 
predicted directions will increase with the 
number of treatment interviews is not sup- 
ported for the first 4-month period of treat- 
ment. Patients, therapists, and social work 
interviewers agree that no differences are 
ascribable to treatment frequency. 

2. A comparison of treatment frequency 
groups after 8 months of treatment on pa-
tient and therapist measures again fails to sup-
port the hypothesis that the number of treat-
ments is positively related to therapeutic gain. 

3. After 4 months of treatment both thera- 
pists and social workers observe an over-all 
decrease in severity of illness not associated
with frequency of interviews. Social workers 
also note greater patient Social Adjustment. 

4. Following 8 months of psychotherapy 
patients increase significantly in Ego Strength 
score, as predicted. Patients also describe 
themselves with fewer Dependency Adjectives 
than at pretreatment. Neither change is asso-
ciated with frequency of therapy interviews.

5. Following 8 months of psychotherapy 
there is a significant decrease in severity of 
illness not associated with treatment fre- 
quency. Therapists also report significantly 
more interpersonal changes and reductions in 
symptoms than for the 4-month period.

6. The High interview group patients who 
remain in psychotherapy for a year describe 
themselves as more outspoken, assertive, in- 
dependent, and determined to protect their 
own interests than the Low group patients. 

7. The High interview group patients out 
of treatment at the one-year follow-up de- 
scribe themselves as more assertive, out- 
spoken, and independent than the Low group. 
In addition, the High group shows a greater 
reduction in the use of Dependency Adjec-
tives.

8. As a group those patients who remain in 
psychotherapy for a year change on certain 
test variables unaffected by treatment fre- 
quency. They decrease significantly in anxiety 
(MA scale), report fewer symptoms, and in- 
crease in ego strength (Barron scale). Thera- 
pists report significant decreases in severity of 
illness, and a greater number of favorable in- 
terpersonal changes and symptom reductions 
as compared to the first 4 months of treat-
ment. The pattern of favorable change is
broader than at 4 and 8 months.

9. Of patients out of treatment at the time 
of the one-year follow-up re-evaluations, only 
the High interview group exhibits changes 
from pretreatment status. 

REFERENCES 

Adorno, T. W., Frenkel-Brunswik, Else, Levin-
son, D. J., & Sanford, R. N. The authoritarian
personality. New York: Harper, 1950.

Applezweig, M. H., Dibner, A. S., & Osbourne,
R. L. PEAQ: A measure of psychopathic behavior.
J. clin. Psychol., 1958, 14, 26-30.

Barron, F. An ego-strength scale which predicts re-
sponse to psychotherapy. J. consult. Psychol., 1953,
17, 327-333.

Cartwright, D. S. Success in psychotherapy as a
function of certain actuarial variables. J. consult.
Psychol., 1955, 19, 357-363.

Ends, E. J., & Page, C. W. Group psychotherapy
and concomitant psychological change. Psychol.
Monogr., 1959, 73(10, Whole No. 480).

Feldman, R., Lorr, M., & Russell, S. B. A mental
hygiene clinic case survey. J. clin. Psychol., 1958,
14, 245-250.

Guilford, J. P., & Zimmerman, W. The Guilford-
Zimmerman Temperament Survey: Manual of in-
structions and interpretation. Beverly Hills, Calif.:
Sheridan Supply, 1949.

Hollingshead, A. B., & Redlich, F. C. Social class
and mental illness: A community study. New
York: Wiley, 1958.

Imber, S. D., Frank, J. D., Nash, E. H., Stone,
A. R., & Gliedman, L. H. Improvement and
amount of therapeutic contact: An alternative to
the use of no treatment controls in psychotherapy.
J. consult. Psychol., 1957, 21, 309-315.

La Forge, R., & Suczek, R. The interpersonal dimen-
sion of personality: III. An interpersonal check
list. J. Pers., 1955, 24, 94-112.

Leary, T. Interpersonal diagnosis of personality.
New York: Ronald, 1957.

Lorr, M., Holsopple, J. D., & Turk, Elizabeth.
A measure of severity of illness. J. clin. Psychol.,
1956, 12, 384-386.

McNair, D. M., & Lorr, M. Therapist judgments of
appropriateness of psychotherapy frequency sched-
ules. J. consult. Psychol., 1960, 24, 500-506. (a)

McNair, D. M., & Lorr, M. Two therapist measures
of patient change in psychotherapy. Amer. Psy-
chologist, 1960, 15, 386. (Abstract) (b)

McNemar, Q. Psychological statistics. New York:
Wiley, 1955.

Myers, J. K., & Auld, F. Some variables related to
outcome of psychotherapy. J. clin. Psychol., 1955,
11, 51-54.

Seeman, J. Counselor judgments of therapeutic proc-
ess and outcome. In C. R. Rogers & Rosalind F.
Dymond (Eds.), Psychotherapy and personality
change. Chicago: Univer. Chicago Press, 1954. Pp.
99-108.

Taylor, J. W. Relationship of success and length in
psychotherapy. J. consult. Psychol., 1956, 20, 332.

Taylor, Janet A. A personality scale of manifest
anxiety. J. abnorm. soc. Psychol., 1953, 48, 285-
290.

Thompson, Clara. Psychoanalysis: Evolution and
development. New York: Hermitage House, 1950.

(Received March 2, 1961) 












Journal of Abnormal and Social Psychology 
1962, Vol. 64, No. 4, 293-301 


THE STIMULUS QUALITIES OF THE SCAPEGOAT¹


LEONARD BERKOWITZ AND JAMES A. GREEN


University of Wisconsin 


Generally speaking, most explanations of 
social prejudice are somewhat one-sided. They 
concentrate on factors either in the prejudiced 
individual or in the victimized group, but 
typically do not effectively relate the attacker 
to the attacked. The present paper will at- 
tempt to show that the object serving as the 
target for the intolerant person’s aggression 
usually has certain stimulus qualities for this 
person, and that objects not possessing these 
characteristics are less likely to be attacked. 
In dealing with ethnic prejudice, in other 
words, it is necessary to consider both the 
aggressor and the available targets. To focus 
on either alone is to give only part of the 
picture. 

Such one-sidedness is particularly apparent 
in the scapegoat theory of prejudice. The de- 
tails often vary from one writer to another, 
but all versions of this common social science 
doctrine seem agreed on at least the follow- 
ing features: Frustration generates aggressive 
tendencies, which cannot be directed against 
the actual thwarting agent because this agent 
is not visible, or is capable of retaliating with 
severely punitive actions. A needed outlet is 
then found for the pent-up aggressive “en- 
ergy” through attacks upon some innocent 
minority group. The displaced aggression is 
rationalized by blaming the minority for the 
frustrations the aggressor has experienced, 
and/or attributing undesirable characteristics 
to this group. 

Several authorities (e.g., Allport, 1954; 
Zawadski, 1948) have noted that the above 
type of theorizing leaves many important 
questions unanswered. It does not tell us, for 
example, why a particular minority is at- 
tacked when any number of groups are avail- 
able. Why are Jews more likely to be the vic- 
tim of the displaced aggression than, say, 
people of Scottish descent? Considerations 


1 This research was supported by Research Grant 
other than that an individual is frustrated
and unable to attack the source of his frus- 
tration obviously must be introduced in or- 
der to handle this problem. As Zawadski 
(1948) pointed out, analyses of the scape- 
goating process tend to be “pure drive” theo- 
ries, explaining the origin of the aggressive 
tendencies, but not the target selection. Other 
variables must be employed to deal with the 
choice of object for aggression. 

Many writers (e.g., Williams, 1947), bas- 
ing their reasoning on the psychoanalytic 
“energy” model of behavior, have assumed, 
almost as a matter of course, that the 
thwarted individual who is afraid to at- 
tack the actual anger instigator will aggress 
against the person least likely to harm him 
by retaliatory aggression. An aggressive out- 
let must be found and presumably is found in 
attacks upon objects who cannot inflict in- 
jury in return. The most likely target for 
displaced aggression, then, supposedly is the 
safest available target. This formulation has 
also been seriously questioned (e.g., Allport, 
1954). White and Lippitt (1960) have been 
among the most recent critics of this “safety” 
hypothesis in their latest report of their now 
classic leadership study. They observed a 
number of instances of scapegoating in the 
frustrated autocratically led groups, but 
claimed that the victims were never the weak- 
est or most passive boys in the club. The boys 
singled out for aggression in one autocratic 
group “were both boys who could hold their 
own against any of the others taken singly,” 
while in another club the scapegoat was the 
largest and heaviest boy. White and Lippitt 
suggested that these particular boys were at- 
tacked in an attempt to recover status or self- 
esteem. Frustrations, they maintained, elicit 
hostility only when they lower self-esteem. 
By aggressing against these boys who were 
fairly strong and dangerous but without be- 
ing excessively formidable, the attackers pre- 
sumably could regard themselves as strong 
and potent in their own right, and thus their
self-esteem supposedly would be restored (p.
166). 

It is doubtful, however, whether status re- 
covery can satisfactorily account for every 
case of target selection in displaced aggres- 
sion. Why would attacks upon Jews provide 
a greater restoration of self-esteem than at- 
tacks upon minorities of Nordic origin? The 
solution to this problem of object choice must 
involve the stimulus properties of the various 
available objects. Along these lines, Williams 
(1947), among others, proposed that scape- 
goats frequently are visibly different or 
strange. It is the strangeness of the available 
objects that determines their likelihood of 
evoking displaced hostility. Strangeness or dif- 
ference itself supposedly is disturbing. Allport 
(1954) has been inclined to accept such a 
thesis, contending that whatever instinctive 
basis there may be for group prejudice can 
perhaps be found in the “hesitant response . . . 
human beings have to strangeness” (p. 130). 
Babies of about 6 months of age cry or be- 
come emotionally upset when a stranger draws 
near them, and such a reaction tendency may 


persist into adulthood. The present writers, 
believe, nevertheless, that strangeness is up- 
setting only under certain limiting conditions. 


Animal research (cf. White, 1959) and the 
rush of tourists to foreign countries indicate 
that strange and novel stimuli may be en- 
ticing in some circumstances. 

Fear of a stranger largely arises when the 
individual expects the unknown person to be 
potentially dangerous. If a person is afraid of 
strangers or people who are greatly different 
from himself he probably views most people 
as being dangerous; the stranger is an “ink- 
blot” eliciting the responses the person cus- 
tomarily makes to people. Ethnocentric per- 
sonalities, of course, are relatively likely to 
be antagonistic to those who are different 
(Adorno, Frenkel-Brunswik, Levinson, & San- 
ford, 1950), and evidence suggests these peo- 
ple often are fairly insecure. They supposedly 
are uncertain of themselves, the world about 
them, and their place in the world. Thus, ac- 
cording to a study by Allport and Kramer 
(1946), highly prejudiced adults are much 
more likely than their more tolerant peers to 
agree that, “The world is a hazardous place 
in which men are basically evil and danger-
ous.” Having this outlook, it is not surprising
that they are unfriendly toward outgroups. 
Since the world is a threatening place for 
them, an unknown person from this world 
also is potentially dangerous. 

The above analysis explains the ethnocentric 
individual’s relatively strong tendency to gen- 
eralize his frustration induced aggressive tend- 
encies toward strangers (Berkowitz, 1959). 
The stranger, and more generally, the alien 
group, is somewhat threatening and, because 
it is threatening, is disliked. Dislike, we con- 
tend, mediates the generalization of aggres- 
sion. Aggression will generalize from the 
anger-instigator to another person in direct 
proportion to the degree of dislike for this 
latter individual. If this hypothesis is correct, 
we would have a means of integrating the 
scapegoat theory of prejudice with those other 
accounts of intergroup conflict focusing upon 
the characteristics of the socially victimized 
ethnic groups. Where the scapegoat theory is 
a “pure drive” theory, to employ Zawadski’s 
(1948) terminology, these latter notions— 
such as the so-called “well-earned reputation” 
doctrine—can be described as “pure stimulus” 
theories (Zawadski, 1948). Though they differ 
in important ways, these “stimulus” theories 
have at least one aspect in common; they all 
provide reasons why given minority groups 
are disliked. Putting it simply, the present 
argument holds that the aggressive tendencies 
engendered by frustrations are generalized to 
those groups whose perceived characteristics 
result in their being disliked. The individual 
may absorb his family’s or his culture’s nega- 
tive attitudes towards a particular group, or 
he may have had unpleasant experiences with 
members of this group. The genesis of the 
negative attitude is unimportant as far as we 
can see. As long as a group is disliked, what- 
ever the reason, it is a likely target for dis- 
placed aggression. 

This is not to say that particular charac- 
teristics of the minority group have no part 
in determining its likelihood of becoming a 
scapegoat. While investigators of the authori- 
tarian personality usually maintain only that 
a group is attacked merely because it is an 
“outgroup” (cf. Adorno et al., 1950, p. 233), 
they sometimes, in company with other psy- 
choanalytically oriented writers, also empha-
size the importance of the group’s perceived
qualities. The prejudiced individual, for ex- 
ample, supposedly projects his own disap- 
proved sexual wishes onto Negroes because 
the stereotype of this group easily accom- 
modates such a projection, and then hates 
Negroes because of their perceived sexuality. 
Similarly, Jews are said to symbolize other 
properties the authoritarian personality un- 
consciously sees and detests in himself. 

The problem we are addressing ourselves to 
is the theoretical significance of the charac- 
teristics attributed to (or actually possessed 
by) the outgroups. Jews share few, if any, 
specific features with Negroes. In the United 
States at least, sexual qualities typically are 
not assigned to the former (Allport, 1954), 
but both groups are likely targets for the 
same prejudiced individual’s hostility. Just 
what do Jews, Negroes, and other outgroups 
have in common that results in their all being 
victimized? Most investigators of authoritari- 
anism have been too concerned with the de- 
tailed depths of the prejudiced personality to 
look for the abstract principle that effectively 
integrates the various instances of scape-
goating.

The specific characteristics perceived in a
group do four things from our point of view.
Most important, they determine whether the
minority is disliked and, if so, how strongly.
This is the quality shared by the victims of
displaced hostility. They are disliked for dif-
ferent reasons (the Authoritarian Personality
and other writings give us some of these rea-
sons), but all are detested. As a result of be-
ing the object of negative attitudes, then,
hostility engendered by some frustration will
generalize fairly readily to these outgroups.
Two, the extent of hostility generalization, we
believe, is a function of the total degree of
association between the immediate frustrator
and the objects available for scapegoating.
The perceived properties of these latter ob-
jects, as well as the dislike for them, con-
tribute to the psychological ties the thwarted
person can draw between his frustrator and
the available targets. The generalized aggres-
sive tendencies of course will not lead to overt
attacks if the intolerant person fears he will
be punished for displaying aggression and/or
believes such hostility is morally improper.
The outgroup characteristics also may affect
these factors. They can determine, three, 
whether the prejudiced individual believes it 
is safe to attack a given outgroup and, four, 
whether he is ethically justified in doing this. 

The present reasoning obviously is based 
on the stimulus generalization of displace- 
ment (cf. Miller, 1948). Although our specific 
predictions do not necessarily require postu- 
lating stimulus generalization, we assume the 
negative attitudes associated with both the 
immediate frustrator and the disliked group 
give rise to a generalization continuum link- 
ing these two objects. In other words, because 
the thwarted individual makes the same im- 
plicit responses to the two objects the dis- 
liked group is associated with the frustrator. 
There may be an association between the two 
solely because they have elicited the same 
negative emotions in the individual. Another 
possibility is that the frustrated person im- 
plicitly applies the same label to both dis- 
liked objects, placing them in the same nega- 
tively evaluated category. Dollard and Miller 
(1950), in advancing the somewhat similar 
concept, “acquired equivalence of cues,” con- 
tended that a previously neutral stimulus will 
produce the response elicited by a particular 
category of stimuli after the subject has 
learned to apply the category name to this 
stimulus. But however it comes about, there 
presumably is an acquired (i.e., response in- 
duced) equivalence between the frustrator 
and the minority group which mediates the 
generalization of aggressive responses from 
the former to the latter. 

Two earlier papers by Berkowitz and 
Holmes (1959, 1960) provide evidence sup- 
porting the “dislike hypothesis.” In the first, 
findings were reported suggesting that hos- 
tility is indeed generalized from the immedi- 
ate frustrator to another stimulus person (P) 
the subject had previously learned to dislike. 
Subjects were first induced either to like or 
dislike their experimental partners (the Ps). 
After this, half of the subjects in each of 
these conditions were frustrated by the ex- 
perimenter, with the others receiving a more 
pleasant treatment from him. Then, in the 
last phase of the study, pairs of subjects 
(each subject regarding the other as P) were 
brought together again for a cooperative task. 
It was shown that the subjects who had ex-
pressed relatively intense hostility to the ex- 
perimenter after having been frustrated by 
him and who then were assembled with a dis- 
liked P on the final task generally expressed 
the strongest unfriendliness to this P on a 
questionnaire evaluation of him at the end of 
the session. The presumably intense hostility 
engendered by the experimenter apparently 
had generalized to the disliked P to a greater 
extent than to the more highly liked partner. 

The second experiment (Berkowitz & 
Holmes, 1960) obtained similar findings with 
stronger and more direct acts of aggression. 
Pairs of subjects were put through the same 
procedures employed in the first study, but 
this time during the last phase of the ex- 
periment each subject was given a socially 
sanctioned opportunity to administer electric 
shocks to his partner. The subject was to 
evaluate P’s performance on an assigned task 
by giving P electric shocks: one if the prod- 
uct was very good, more than this if the per- 
formance was thought to be poor. In actu- 
ality, each subject was shown the same prod- 
uct. There was the greatest increase in the 
number of shocks administered to the partner 
(in comparison to the number given during 
a baseline period at the start of the study) 
when the subject had been frustrated by the 
experimenter and then had an opportunity to 
shock the disliked P. Again it seems as if the
aggressive tendencies evoked by the unpleasant
experimenter had transferred to the unpleasant
P.

However, there is at least one important 
difficulty confronting this interpretation. The 
subjects sending the greatest number of shocks 
to P had been frustrated twice: once during 
the manipulation creating the dislike to P, 
and again by the experimenter. Although in- 
ternal evidence contrary to this possibility 
was reported, it is conceivable that the rela- 
tively great amount of aggression exhibited 
by these subjects was due solely to the in- 
tense anger aroused within them by the two 
thwartings. They could have been so angry 
they would have attacked anyone strongly, 
whether this target was disliked or not. 

The present experiment seeks to test this 
alternative explanation. Essentially the same 
procedures utilized in the earlier experiments
are again employed. However, the subject is
now given two people to evaluate during the 
final phase of the experiment. One of these, 
as in the first two studies, had previously been 
either friendly or unfriendly to him. The 
second person is presumably fairly neutral 
since the subject had not interacted with him 
before. If the intense anger created by the 
two thwartings was the crucial determinant 
of the previous results, there should be no dif- 
ference in the final evaluation of these two 
stimulus people, and both should be regarded 
more unfavorably after the subject is frus- 
trated twice than after the subject is thwarted 
once or not at all. On the other hand, only 
the disliked P should receive the harshest 
evaluation, and not the “neutral” stimulus 
person, after the subject is thwarted by the 
experimenter if the existing attitude towards 
the available target affects this object’s likeli- 
hood of receiving generalized aggression. 


METHOD 
Subjects 


The subjects were male students from introductory 
psychology classes at the University of Wisconsin 
who volunteered without knowing the nature of the 
experiment. Four subjects were discarded after the 
pretesting was terminated—one because he had not 
completed the final questionnaire and three who indicated
they were suspicious of the treatment accorded
them. No more than two of these “discards”
came from any one condition. There were 18 subjects 
in each of the four experimental conditions in the 
final sample. 


Procedure 


Two subjects who did not know each other were 
scheduled for any one experimental period. After
both had arrived at the laboratory they were joined 
by a third male posing as the third experimental sub- 
ject but who was, in actuality, the experimenter’s 
confederate. The experimenter explained the osten- 
sible purpose of the study, saying the experiment 
was to investigate the effects of stress upon crea- 
tivity. They were told each subject would first make 
a judgment of the personalities of his two partners 
(supposedly because creative people were good at 
making “snap judgments”), and then two of the 
three would work on a problem solving task under 
mild stress, while the third person would be alone 
in a natural condition as a “control.” The stress, 
they were told, would come from knowing they 
might receive several electric shocks if their per- 
formance was not too good. Each of the two people 
in the Stress condition was to work independently 
of the other on the assigned task. When they had
finished they were to exchange their problem solutions
so that each would serve as the judge of the
other’s performance. Each subject would transmit 
his evaluation of his partner’s product by giving 
him electric shocks. There would be one shock if 
the product was very good and more than this if it 
was thought to be inadequate. Following this, the 
experimenter said, each subject would work alone 
and he would be judged (without shocks) by the 
experimenter. Finally, in the last part of the session, 
the three men would work together on a group task.

At this time the subjects were told the shocks 
would be relatively mild and they were given an 
opportunity to withdraw from the experiment if 
they objected to receiving shocks. None did. Letters 
were then assigned to the three men, the two “real 
subjects” being called A and C, and the confederate 
B. A and C were told that they would work under 
the Stress condition, and all three men were led to 
separate rooms where the naive subjects indicated 
their first impressions of their partners on an adjec- 
tive checklist. The code letters were used in all of 
the ratings made throughout the experiment.

When the first personality evaluations were com- 
pleted, both men were given the task of designing 
a “novel, imaginative, and creative” floor plan for a 
house. Each subject was informed that he would 
have 5 minutes to work on this problem and that 
his partner was to have the same task. After 5 
minutes had passed by, the experimenter collected 
the subject’s work, strapped shock electrodes onto 
his wrist, and then showed him what was supposedly 
the partner’s (P’s) performance but actually was 
previously constructed to be standard for all condi- 
tions. Each subject was told he was to go first in 
evaluating the other’s performance by means of the 
electric shocks. He was to press a button on a nearby 
table one or more times depending upon his judg- 
ment of the other’s work. The experimenter then 
left the room ostensibly to deliver the subject's prod- 
uct to his partner. When his instruments (to which 
the shock buttons were connected) informed him 
that both subjects had given shocks, the experimenter 
administered the shocks the subjects believed came 
from their partners.


Experimental Manipulations 


The first manipulation, as in the preceding in- 
vestigations, was designed to create differences in 
initial liking for P. Subjects in the Initial Liking for 
P condition received one shock, indicating that P 
had given them the most favorable evaluation. The 
other half of the subjects, those in the Initial Dislike
for P condition, were given six electric shocks.
Thus, on top of whatever physical hurt they felt, 
these subjects knew P had derogated their perform- 
ance. 

Subjects worked alone on the next problem (sug- 
gesting an original idea for attracting new customers 
to a gasoline station) without exchanging papers, 
but this time the experimenter, rather than P, was 
the perceived anger instigator. Half of the subjects
in each of the above two conditions (those receiving
the Frustrated by the Experimenter “treatment”)
were criticized and insulted by the experimenter dur- 
ing this problem by standardized messages trans- 
mitted to them via earphones. The remaining sub- 
jects (in the Nonfrustrated by the Experimenter 
condition) heard a friendlier evaluation of their 
work. Following this interaction with the experi- 
menter, subjects were asked to fill out a short ques- 
tionnaire, supposedly an evaluation of psychological 
experiments to go to the Chairman of the Depart- 
ment. This form, of course, was primarily intended 
to test the success of the manipulation (Frustration 
by the Experimenter) in arousing unfriendliness to- 
ward the experimenter. 

In the third and final part of the study the two 
subjects and the confederate were brought together 
for 5 minutes to assemble a footbridge from mate- 
rials stacked in the room. The three men returned 
to their individual rooms at the end of the work 
period. Once there, the subjects completed an alter- 
nate form of adjective checklist indicating their as- 
sessments of their two peers supposedly based on 
all the information they had obtained throughout 
the experiment.2 When this was done, the experi- 
menter explained the actual purpose of the experi- 
ment and informed them of the deceptions he had 
practiced. Many of the subjects expressed a good 
deal of interest in the study and all promised not to 
talk about it to their friends. 


Dependent Variables 


The measure of each subject's attitudes toward P 
(with whom he had interacted throughout the 
study), and toward the confederate (the presumably 
neutral person) was based on the adjective checklist,
a version of a technique used with apparent success
in other research in the present program (e.g.,
Berkowitz, 1960). As mentioned above, two alter- 
nate forms were employed, one at the beginning, 
the other at the end of the session. Each form con- 
sisted of 33 adjectives. The subjects in responding 
to these forms were to mark a True-False IBM sheet 


2 The subjects also filled out a brief four-item
scale assessing their attitudes toward each of their
two peers. The results with this measure, not reported
here because of indications that the adjective
checklist had affected these later scale responses
in the most strongly aroused condition (cf. Berkowitz
& Holmes, 1960), are generally consistent with
the adjective checklist findings. Thus the Initial
Dislike for P-Frustrated by the Experimenter group
was the only group experiencing some thwarting
which gave P significantly more unfavorable ratings
than were assigned to him in the Initial Liking for P-
Not Frustrated by the Experimenter condition. The
harsh treatment given the subjects by P, furthermore,
did not significantly affect the attractiveness
of the confederate on this measure, although (as was
also found with the adjective checklist scores) there
was some hostility generalized to the confederate
when the experimenter had been a frustrator.

(using separate IBM sheets but the same form for
P and the confederate), stating which of the adjec- 
tives characterized the given stimulus person and 
which did not. Unknown to the subjects, a large 
group of judges had previously scaled the adjectives 
as to their overall social desirability. Thus it was 
possible to assess the level of unfriendliness in the 
subject’s judgments of each of the other two men 
by adding the number of undesirable traits attributed 
to the person to the number of favorable characteristics
he was said not to possess. As a working assumption,
high unfriendliness is taken to signify relatively
intense aggressive tendencies.
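
The scoring rule just described can be sketched in a few lines. This is only an illustration under stated assumptions: the four adjectives and the simple desirable/undesirable split are hypothetical (the actual forms held 33 adjectives scaled by judges), and the function name is ours, not the authors'.

```python
def unfriendliness_score(marked_true, desirability):
    """Count undesirable traits marked True plus desirable traits marked False.

    marked_true:  adjective -> True/False answer given on the checklist
    desirability: adjective -> True if judges rated the trait desirable
                  (a binary split is an assumption made for this sketch)
    """
    score = 0
    for adjective, is_desirable in desirability.items():
        attributed = marked_true.get(adjective, False)
        if attributed and not is_desirable:
            score += 1  # an undesirable trait attributed to the person
        elif not attributed and is_desirable:
            score += 1  # a favorable trait the person is said not to possess
    return score


# Hypothetical four-adjective form, for illustration only.
desirability = {"friendly": True, "sincere": True, "rude": False, "selfish": False}
ratings = {"friendly": False, "sincere": True, "rude": True, "selfish": False}
print(unfriendliness_score(ratings, desirability))  # prints 2
```

A change score of the kind reported in Table 1 would then be the end-of-study score minus the first-impression score, so that negative values mean declining unfriendliness.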

Two questions embedded in a group of four were
employed in assessing the success of the Frustration
by the Experimenter. One asked, “How much did
you enjoy the experiment?” and the other, “How
favorable was your reaction to the experimenter?”
In answering each question the subjects were to
place a mark at an appropriate position on a linear
rating scale anchored at each end by a suitable
phrase (e.g., “not at all”). The scores were distances
from the favorable end of the continuum in 1-centimeter
units. Subjects criticized and insulted by the
experimenter should enjoy the experiment less and
have a more unfavorable opinion of the experimenter
than subjects receiving a friendlier treatment from
him.


RESULTS 


Success of the Experimental Manipulation 


There were no measures taken in the pres- 
ent study of the subjects’ feelings toward P 
immediately after they had obtained his first 
“evaluation” of their work. Results from the 
preceding investigations in this series (Berko- 
witz & Holmes, 1959, 1960), which employed 
virtually the same procedure, suggest, never- 
theless, that the shock “evaluations” prob- 
ably affected the level of the subjects’ un- 
friendliness toward P. The subjects in these 
earlier experiments responded to a question- 
naire within a few minutes of receiving the 
first shocks from P. Those getting the most 
shocks were more likely than the subjects 
getting one shock to indicate that P was “unfair,”
and to say that they had relatively
little desire to know this person better. How- 
ever, since these first investigations employed 
a greater number of shocks in the Initial Dis- 
like for P condition, the earlier subjects re- 
ceiving this treatment may have been more 
strongly angered than the comparable sub- 
jects in the present study. 

The questionnaire ratings of the experimenter
obtained in this experiment clearly indicate
that his criticisms and insults in the
Frustrated by the Experimenter condition 
had succeeded in provoking the subjects. 
Analyses of variance on each of the two rele- 
vant measures revealed significant main ef- 
fects (p < .01 in both cases) for the frustra- 
tion manipulation and no significant interac- 
tions. Thus the “aroused” subjects expressed 
reliably less enjoyment of the experiment and 
a significantly more unfavorable evaluation of 
the experimenter than the subjects receiving 
friendlier treatment from him. 


Level of Unfriendliness toward P and the Confederate


The results obtained with the alternate 
forms of the adjective checklist administered 
at the beginning and end of the experimental 
session are summarized in Table 1. 

It can be seen that the two stimulus peo- 
ple were not equally attractive to the sub- 
jects at the start of the study; the subjects 
in each condition had a significantly more fa- 
vorable “first impression” of P than of the 
confederate. However, these initial attitudes 
apparently were only tentative and not too 
strongly held. Most subjects exhibited a con- 
siderable decline in unfriendliness to both of 
their partners by the end of the experiment. 

This decreased unfriendliness, nevertheless, 
did not occur to the same extent in all con- 
ditions. As we would expect from the earlier 
investigations, the only stimulus people not 
getting reliably more favorable evaluations 
at the end of the session were those men 
whom the subjects presumably had previ- 
ously learned to dislike and who were being 
judged by subjects frustrated by the experi- 
menter. Both of these independent variables 
had to combine in order to retard the growth 
of friendship. The subjects receiving only one 
of the harsh treatments, whether from P or 
the experimenter, showed a considerable de- 
cline in unfriendliness toward P. The table 
also indicates this impeded friendship de- 
velopment was not due simply to an accumu- 
lation of frustration effects. Subjects insulted 


3 The decreased unfriendliness may have been
merely an effect of differences in adjective checklist
forms. We think this is unlikely since the two forms
were quite comparable in terms of the overall favorability
of the traits listed.

by both P and the experimenter still became
less unfriendly toward the confederate.
Clearly, then, the resentment aroused by the
experimenter primarily affected the judgment
of the disliked P. It did not interfere with
the growth of friendship for the confederate.
The previously learned attitude toward the
available stimulus person affected the degree
to which he was the victim of anger created
by someone else. Stimulus objects the subjects
had not grown to dislike (i.e., the liked
P and the confederate) did not receive anywhere
near as much generalized aggression.

There was some hostility generalization to
the confederate, however. As the table notes,
this person was regarded significantly more
favorably by those subjects who were not
thwarted at all than by the subjects suffering
at least one frustration. Nevertheless, these
provocations did not retard the development
of some friendliness toward the confederate.
He may have been associated with the frustrators,
but clearly was not one of them.


TABLE 1

MEAN UNFRIENDLINESS TO THE PARTNER AND TO THE EXPERIMENTER’S CONFEDERATE

Condition                             Stimulus person   First impression   End of study   Change score
Initial Dislike for P,                P                       9.0               8.4           -0.6
  Frustrated by the experimenter      Confederate            11.1               8.2           -2.9
Initial Liking for P                  P                       8.4               5.8           -2.6
  (frustration column illegible)      Confederate            10.9               6.5           -4.4

[The remaining cells of the original table and the significance subscripts are illegible in this copy.]

Note. The absolute scores obtained at the start and conclusion of the study were subjected to one “repeated measures” analysis of variance. In the above table the higher the mean the more unfriendly the attitude toward the given stimulus person. Cells having the same subscript are not significantly different from each other by Duncan (1955) multiple range test.


DISCUSSION


The above results, by and large, support
the expectations underlying the present experiment.
They indicate that the antagonism
created by one unpleasant person has a
stronger adverse effect on the individual’s
feelings toward another unpleasant person
than on his attitudes toward someone he
does not dislike.* This enhanced resentment,
furthermore, can lead to relatively intense

*It is important to note that these results have
been obtained with both male and female college
students.


acts of hostility, as was shown in the preced- 
ing experiment in this series (Berkowitz & 
Holmes, 1960). In that study people given an 
opportunity to injure a disliked person (by 
administering electric shocks) in a socially 
sanctioned manner after having been insulted 
and criticized by the experimenter, generally 
performed more of these injurious acts than 
did the subjects also responding to a disliked 
person who had not been frustrated by the 
experimenter or the thwarted subjects given 
an opportunity to attack someone they pre- 
sumably liked. Generalizing from these find- 
ings to the area of intergroup relations, we 
can hypothesize that the ethnic groups most 
likely to become the victim of displaced ag- 
gression are those groups the frustrated peo- 
ple had come to regard as being unpleasant 
for one reason or another (that is, assuming 
these thwarted individuals interact with these 
groups or otherwise become aware of them). 

These data do not require postulating a 
stimulus induced generalization of aggressive 
tendencies from the immediate frustrator to 
the disliked individual. Many writers (e.g., 
Freud) have conceived of the aggressive drive 
as energy continually seeking an outlet. The 
attacked person supposedly merely provides 
this opportunity for release. From this point 
of view, then, we might say the aggressive 
tendencies aroused, strengthened, or released 
by the frustrating experimenter were inhibited 
when the subjects evaluated the neutral con- 
federate or the presumably liked P. The de- 
gree of aggression inhibition could have been 
in direct ratio to the attractiveness of these 
stimulus objects. Such a formulation, of
course, is compatible with “balance” (Newcomb,
1959) or “dissonance-avoiding” (Festinger,
1957) propositions. Dissonance (plus
aggression-anxiety) would be aroused within
a person if he knew he had deliberately injured
someone he liked. He would have to
suppress any inclinations he might have to
hurt the attractive person if he is to avoid
this dissonance (and aggression-anxiety).

We agree that this type of phenomenon 
probably occurs. However, we also believe, 
perhaps only as an article of faith, that the 
people the frustrated individual encounters 
after he is thwarted can serve as stimuli 
evoking aggressive responses from him. Ac- 
cording to our present reasoning the percep- 
tion of a disliked object is sufficient to elicit 
such hostility. However, it may be that only 
people who have inflicted injury on the indi- 
vidual, as was the case in this study, will 
evoke such generalized aggression, and not 
every disliked person. 

The increased unfriendliness toward the 
confederate following the frustration of the 
experimenter can perhaps also be explained 
as the result of a stimulus generalization 
process. The provoked subjects could have 
associated the confederate with the instigator 
for several reasons. Both were involved in the 
experiment; both were somewhat unpleasant 
(although, as noted earlier, the unfavorable 
attitude toward the confederate might have 
been held with little conviction); and emo- 
tion arousal seems to reduce the use of pe- 
ripheral cues, resulting in relatively gross dis- 
criminations among the available stimulus ob- 
jects—in essence, flattening and extending the 
generalization gradient (Easterbrook, 1959). 
Nevertheless, the generalization of hostility 
to the confederate in the present study was 
unexpected and further research is necessary 
to determine which, if any, of these factors 
gave rise to the generalization. 

The present argument, then, offers a rela- 
tively simple solution to the problem of tar- 
get selection in scapegoating. At least some 
of the displacement of hostility upon certain 
minority groups can be accounted for by the 
thwarted individual’s prior dislike for these 
groups. Feeling this way about them, he pre- 
sumably associates these people with his most 
recent frustrators. An industrial worker may
become more aggressive to the Jews in his
community after receiving a pay cut because 
he associates the disliked Jews with the dis- 
liked factory owners and managers. 

Other variables, however, can also inter- 
vene to affect the total strength of the asso- 
ciation between ethnic group and immediate 
frustrator. This linkage may be weakened 
somewhat by knowledge forcing a discrimina- 
tion between the particular minority and the 
frustrating source. It may also be strength- 
ened by additional characteristics shared by 
the minority and the frustrator. Going back 
to the illustration of the disgruntled factory 
worker, suppose he has negative feelings to- 
wards both Turks and Jews and encounters 
a member of each of these groups. His knowl- 
edge that Turks generally are not involved in 
business management may weaken the total 
associative bond between this ethnic group 
and the factory owners. Jews, on the other 
hand, often are businessmen and, more than 
this, may also be regarded as rich and un- 
scrupulous—just like the owners of the plant. 
There are several attributes held in common 
by Jews and the immediate frustrators as far 
as this individual is concerned, resulting in a 
heightened association between these people. 
Conditioned S-R bonds may summate (Hull, 
1943, pp. 209-210). Furthermore, as we men- 
tioned earlier, the perceived properties of the 
disliked minority can lower the individual’s 
internal restraints against aggression. The 
stereotyped conception of Jews—e.g., that 
they are grasping and unscrupulous—acts to 
justify the aggressive inclinations the indi- 
vidual might feel toward this particular group. 
Believing the stereotype, he need not feel 
guilty about attacking this group. Knowing 
the group is in the minority and fairly widely 
disliked also means it is fairly safe to aggress 
against Jews. He need not fear retaliatory ag- 
gression either from the Jews in the street or 
from his peers. The consequence of all this is 
that the Jew is a more probable target for the 
thwarted worker’s hostile tendencies than is 
the Turk. 


SUMMARY 


Most explanations of social prejudice do 
not relate the prejudiced individual to the 
victimized group. They generally are either
“pure drive theories,” dealing only with the
source of the aggressive tendencies, or “pure 
stimulus theories,” explaining only the char- 
acteristics of the attacked minority. The pres- 
ent paper attempts to construct a theoretical 
bridge between the aggressor and the ag- 
gressed-against. The central thesis is that ag- 
gression generalizes from the anger instigator 
to another person in direct proportion to the 
degree of dislike for this person. Hostile tend- 
encies engendered by frustrations are gener- 
alized to those groups whose perceived char- 
acteristics result in their being disliked. 

In the present study 72 college men were 
distributed evenly among four experimental 
conditions created by a 2 × 2 factorial design.
Each subject (working in pairs) was
first induced to either like or dislike his part- 
ner (the P). After this, half of the subjects 
in each of these conditions were individu- 
ally frustrated by the experimenter, with the 
others receiving a more pleasant treatment 
from him. Then, in the last phase of the 
study, the two pair members and a neutral 
peer (actually the experimenter’s confederate)
were brought together to work on a cooperative
task. Each subject’s evaluations of his
partner (P) and the confederate constituted 
the dependent variables. The results indicated 
that the disliked P was the primary victim of 
the resentment aroused by the frustrating experimenter.


VOLUNTEER SUBJECTS 1

Findings (Dittes, 1961; Schachter, 1959) of
greater affiliative behavior among first-born per- 
sons raise the question of birth order as a selec- 
tive factor among subjects volunteering for small 
group experiments. Affiliative and related depend- 
ent tendencies may make first borns more vul- 
nerable to the appeal of a recruiter and to the 
opportunity for participation in small group activity,
especially when participation and affiliation
appear to be guaranteed by the experimenter.
In such contexts, volunteering itself may be un- 
derstood as an affiliative act. 

One hundred Yale freshmen were solicited in 
their dormitory rooms by a senior student for a 
small group experiment to be conducted at a later 
time. The recruiting speech asked freshmen to 
participate in a small group psychology experi- 
ment involving “a group performing a common 
task cooperatively.” 

Results appear in Table 1: 36% of first borns 
and 18% of later borns volunteered for the experiment
(χ² = 3.8, p = .05). Only borns showed
an identical rate with other first borns and are 
therefore included with them in Table 1. 

In the hypothetical experiment, had it been 
conducted, 76% of the subjects would have been 
first born. This proportion is substantially greater 
(p = .10) than the 61% first borns in the Yale
freshman population 2 and close to twice as great
as the probable percentage of first borns in the 
national population, which may be estimated 
from census data as about 40%. 
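
As a check on the figures above, the reported statistic can be recomputed from a 2 × 2 table. The sketch below assumes cell counts of 22 and 39 (first borns volunteering and declining) and 7 and 32 (later borns), which are inferred here from the reported percentages for 100 freshmen rather than quoted from the text, and it assumes a Pearson chi-square without continuity correction.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Assumed counts, inferred from 61% first borns and the 36% / 18% volunteering rates.
first_vol, first_dec = 22, 39
later_vol, later_dec = 7, 32

chi2 = chi_square_2x2(first_vol, first_dec, later_vol, later_dec)
share_first = first_vol / (first_vol + later_vol)  # first borns among volunteers

print(round(chi2, 1))             # prints 3.8
print(round(100 * share_first))   # prints 76
```

Under these assumed counts the uncorrected statistic rounds to the reported 3.8, and first borns make up 22 of the 29 volunteers, which is the 76% cited for the hypothetical experiment.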

Review of the birth order information of sub- 
jects recruited from freshman dormitories for 
two other small group experiments (samples of 
about 100 subjects each) shows first borns to be 
overrepresented (p < .10) among the volunteer 

1 This investigation was supported by Research 
Grant M-3857 from the National Institute of Mental 
Health, United States Public Health Service. Part of 
these results were reported to the American Psycho- 
logical Association 1961 convention in New York. 

2 The sample of 100 freshmen used in this investi- 
gation was representative of the entire freshman class 
with respect to the proportion of first borns. Of the 
freshmen found in their rooms, 61% were only- or 
first born. This is identical with the percentage of 
only- and first borns found in a 25% sample of the 
records of the entire Yale freshman class.


TABLE 1
PROPORTION OF VOLUNTEER SUBJECTS BY BIRTH ORDER

              Volunteered   Declined
First born         22          39
Later born          7          32

subjects, as compared with the total Yale popu- 
lation. 

These results specify one sample bias likely in 
any small group study using volunteers. The 
common allegation that most current knowledge 
of small groups represents only the psychology 
of college freshmen and sophomores could perhaps
be made more precise by restricting it to the first-born
freshmen and sophomores. If our interpretation
is correct, this bias would operate similarly
in those instances of compulsory participation 
for course requirements, so long as the student 
has a choice of participating in one of several 
experiments. Our interpretation is that the ap- 
peal of guaranteed interaction in a small group 
study is more likely to attract first borns.

The consequences of this bias are most serious 
when variables are being studied in which first 
borns are known to differ from later borns. This 
presumably would include the large class of variables
related to affiliation, such as dependence,
cohesiveness, attractiveness of the group, and
such processes as suggestibility and conformity 
that are known to be related to the attractiveness 
of the group. The effects obtained with such vari- 
ables are likely to be exaggerated among volun- 
teer subjects (meaning a higher proportion of 
first borns) as compared with results obtained in 
a random sample of the population. Some .05 
levels of significance obtained with these vari- 
ables among volunteer subjects would not have 
been reached with a random sample.


The term cognitive structure is doubly distressing
to many psychologists. By cognitive, it refers
to something not very accessible, and by structure 
it teasingly adds that this ghost is nicely organ- 
ized or articulated. The present article attempts 
to relieve some of this distress. It reports a novel 
technique for a relatively comprehensive and 
bias-free evaluation of the properties of a cog- 
nitive structure, together with an application of 
the technique that appears to vindicate the term. 

The technique may be described as the study 
of the errors subjects make when they learn a 
mapping of elements into the cognitive structure.
For the present study, it was supposed that Johns 
Hopkins undergraduates have a cognitive struc- 
ture corresponding to their four classes, fresh- 
man, sophomore, junior, and senior, and a sam- 
ple of undergraduates was required to learn a 
mapping of people into this structure. Specifically,
the subjects were taught freshman, sophomore,
junior, and senior as labels for men’s names in
what could be characterized as a paired-associates 
verbal learning task with the names as stimuli 
and the labels as responses. 

There are two principal ways of analyzing the 
subjects’ errors in a task like this, and both of 
them were used here. One is to determine where 
in the structure the subjects learn fastest, under 
the supposition that these points are their anchors 
in the structure; the other is to study in detail 
the confusions among the various points in the 
structure, under the supposition that confusions 
are inversely related to distances between points 
(Shepard, 1958; Torgerson, 1958).
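
Both analyses start from the same tally of wrong answers. The following is a minimal sketch of such a tally, assuming each response is recorded as a (correct label, given label) pair; the sample responses and function names are hypothetical and are not the authors' data or procedure.

```python
from collections import Counter

LABELS = ["freshman", "sophomore", "junior", "senior"]


def confusion_counts(responses):
    """Count each (correct_label, given_label) error; correct answers are skipped."""
    counts = Counter()
    for correct, given in responses:
        if correct != given:
            counts[(correct, given)] += 1
    return counts


def errors_per_label(counts):
    """Total errors made on names belonging to each label (fewer errors, faster learning)."""
    totals = {label: 0 for label in LABELS}
    for (correct, _given), n in counts.items():
        totals[correct] += n
    return totals


# Hypothetical responses, for illustration only.
responses = [
    ("freshman", "freshman"),
    ("sophomore", "junior"),
    ("junior", "sophomore"),
    ("junior", "senior"),
    ("senior", "senior"),
    ("sophomore", "junior"),
]
counts = confusion_counts(responses)
totals = errors_per_label(counts)
print(totals)
```

The per-label totals bear on the first analysis (where learning is fastest), while the pairwise counts bear on the second: labels that are confused more often are taken to lie closer together in the structure.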


METHOD 


The experimental method resembled that of tra- 
ditional paired-associates learning experiments. The 
subject’s task is described well by the instructions 
given him (the experimenter read the instructions 
aloud while the subject followed on his own copy): 


This is an experiment on learning. Your task 
will be to learn which of the labels “freshman,” 
“sophomore,” “junior,” or “senior” is correct for 
each person in the group whose names are typed 
on these cards.


1 This work was supported in part by Grant NSF- 
G10884 from the National Science Foundation. J. J. 
Bosley participated as a Public Health Service Re- 
search Fellow. 


Each card has a man’s name on the front, and
on the back of the same card is the label which
is correct for him. When I give you the cards,
they will be arranged so that the names are up.
The pace for the experiment will be set by means
of the device you see here. This device makes a
sound every 3 seconds. Each time it sounds, read
aloud the name on the top card, and tell me the
one of the labels above that you think correctly
goes with that person. Then pick up the card and
turn it over, so that you can see whether you
were right or wrong, and place the card to one
side after you have checked your answer.

There will now be a different card before you
on the pile. When the timer sounds again, read
aloud the new name and give the label you think
is right for it, checking yourself as before and
placing the card aside on the new pile. Repeat
these steps for each card in the original stack.

After you have gone through all of the cards in
this way, I will shuffle them and give them back to
you for another run-through. By going through
the set time after time, you should eventually
learn the proper label for every name in the set.


There were 16 names in the set, 4 with each label.
The subjects themselves were also equally divided
according to class membership, 7 from each class,
totaling 28. Each subject was run individually to a
criterion of two successive trials on the entire deck
without a mistake. When necessary, a subject was
reminded that he had to give an answer for every
card.
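
The stopping rule described above (two successive errorless trials on the entire deck) can be stated as a short routine. The following is only an illustrative sketch: the function name trials_to_criterion, the run_length parameter, and the error counts are hypothetical and are not taken from the experiment.

```python
from typing import Optional, Sequence


def trials_to_criterion(errors_per_trial: Sequence[int], run_length: int = 2) -> Optional[int]:
    """Return the 1-based trial on which the criterion is first reached.

    The criterion is `run_length` successive trials with zero errors. The value
    returned is the last trial of the first such run, or None if the criterion
    is never met.
    """
    streak = 0
    for trial_number, errors in enumerate(errors_per_trial, start=1):
        # An errorless trial extends the streak; any error resets it.
        streak = streak + 1 if errors == 0 else 0
        if streak == run_length:
            return trial_number
    return None


# Hypothetical error counts for one subject over successive passes through the deck.
example_errors = [9, 6, 4, 0, 2, 1, 0, 0]
criterion_trial = trials_to_criterion(example_errors)
```

Note that a single errorless trial followed by an error (trial 4 in the hypothetical record) does not satisfy the criterion; the run must be unbroken.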


RESULTS 


The data of interest, the errors made by the
subjects en route to learning the correct labels,
are summarized in Table 1. This table shows the
number of errors averaged over subjects and
items as a function of the correct label and the
label mistakenly given. It shows, for example,
that a name for which the correct label was
freshman was miscalled sophomore a mean of
3.48 times, and was miscalled something a mean
of 7.94 times.


TABLE 1
MEAN NUMBER OF ERRORS

[Rows: correct label. Columns: label given by subject (Freshman, Sophomore, Junior, Senior), with row totals in the right-hand column and column totals along the bottom. The cell values are not legible in this copy.]

Inspection of the right-hand column of Table 1 
shows that the subjects made the fewest errors 
on names for which the correct label was fresh- 
man, somewhat more when it was senior, slightly 
more still when it was junior, and the most when 
it was sophomore. The differences among these 
means were established as significant at the .025 
level by analysis of variance. However, such 
error rates can be influenced by response sets or 
guessing tendencies, making it important also to 
examine the bottom of the table, which shows the 
frequency with which each label was given in- 
correctly. It is apparent that freshman was given 
by mistake very little, senior more often, sopho- 
more still more often, and junior very often. 
Thus the low frequency with which subjects mis- 
called freshmen something else is not because of 
a tendency to guess freshman freely. On the con- 
trary, it is in spite of a tendency to guess fresh- 
man infrequently, as demonstrated by the fact 
that the total for the freshman column is actu- 
ally less than the total for the freshman row. 
Similarly, the relatively low frequency with which 
subjects miscalled seniors something else occurs 
despite a relatively low frequency of saying 
senior. The upshot is that the subjects clearly 
learned the freshmen fastest and the seniors next 
fastest, with the sophomores and juniors being 
substantially greater and roughly equal in diffi- 
culty. 
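
The distinction drawn above between row totals (errors made on names of a given class) and column totals (how often a label was given by mistake) can be made concrete with a small tally. This is a sketch under stated assumptions: the helper names tally_errors, row_total, and column_total are hypothetical, and the response list is invented for illustration rather than drawn from Table 1.

```python
from collections import Counter


def tally_errors(responses):
    """Count errors by (correct label, label given); correct answers are skipped."""
    counts = Counter()
    for correct, given in responses:
        if correct != given:
            counts[(correct, given)] += 1
    return counts


def row_total(counts, label):
    """Number of errors made on names whose correct label is `label`."""
    return sum(n for (correct, _), n in counts.items() if correct == label)


def column_total(counts, label):
    """Number of times `label` was given when it was the wrong answer."""
    return sum(n for (_, given), n in counts.items() if given == label)


# Hypothetical (correct, given) pairs from a few card presentations.
responses = [
    ("freshman", "sophomore"),
    ("freshman", "freshman"),   # correct, so not an error
    ("sophomore", "junior"),
    ("junior", "sophomore"),
    ("senior", "junior"),
    ("sophomore", "junior"),
]
counts = tally_errors(responses)
```

A label with a small column total is one that subjects seldom guessed wrongly, so a small row total for the same label cannot be attributed to free guessing of that label.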

This conclusion is confirmed by another meas- 
ure of rate of learning, trials to criterion, which 
is somewhat less influenced by guessing tendencies. 
The mean number of trials the subjects required 
to learn to give the proper label to freshmen 
(with no subsequent errors) was 13.3; for sophomores,
they required 17.5; for juniors, 16.4; and
for seniors, 14.6. The differences among these 
means were established as significant at the .001 
level by analysis of variance. 

Inspection of the interior cells of Table 1 
yields a somewhat different view of the results. 
The rows of the table show what might be called 
response generalization gradients. A freshman is 
miscalled sophomore most, junior less, and senior 
least. A sophomore is miscalled junior more often 
than senior. A junior is miscalled sophomore more 
often than freshman. And a senior is miscalled 
junior most, sophomore less, and freshman least. 
This systematic patterning of the errors, evidence 
in itself of a cognitive structure, encourages an 
attempt to use the frequencies of errors to de- 
termine distances in the structure. There are vari- 
ous ways that this might be done. A very simple 
procedure was chosen with the aim of obtaining
an ordered metric (Coombs, 1950), or, more precisely,
a higher-ordered metric (Siegel, 1956), that is,
a ranking of the distances between the labels
pair by pair. The procedure was to obtain the 
mean confusions of each pair of labels by adding 
together the mean errors in both directions, as- 
suming that the fewer the confusions, the greater 
the distance between the pair of labels. The mean 
confusions of freshmen and seniors is 3.50, the 
smallest such figure, indicating that the distance 
from freshman to senior is the greatest, as it 
should be. The next largest distance is from 
freshman to junior (4.97 mean confusions). Third 
is from freshman to sophomore (5.84 confu- 
sions), and a close fourth is from sophomore to 
senior (6.29 confusions). It might have been ex- 
pected that the latter two-step distance would be 
larger than the former single-step distance, but 
it is not necessary for a consistent higher-ordered 
metric that it be so. Fifth is from junior to 
senior (7.37 mean confusions), and sixth and 
smallest is the distance from sophomore to junior 
(8.31 mean confusions). This ranking of dis- 
tances is consistent and indicates that the dis- 
tance from freshman to sophomore is a very 
large one, that from sophomore to junior is very 
small, and that from junior to senior is intermediate,
all the labels lying on a single dimension.
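
The consistency of this ranking can be checked mechanically. The sketch below is an illustration and not part of the original analysis: it takes the six mean-confusion figures reported above, ranks the pairs from greatest to least distance, and applies one necessary condition for points lying on a line in the assumed order freshman, sophomore, junior, senior, namely that an interval containing another interval must show fewer confusions. The names span and consistent_with_line are hypothetical.

```python
# Mean confusions per pair of labels as reported in the text
# (mean errors in both directions added together).
confusions = {
    ("freshman", "senior"): 3.50,
    ("freshman", "junior"): 4.97,
    ("freshman", "sophomore"): 5.84,
    ("sophomore", "senior"): 6.29,
    ("junior", "senior"): 7.37,
    ("sophomore", "junior"): 8.31,
}

# Assumed order of the labels along the single dimension.
ORDER = ["freshman", "sophomore", "junior", "senior"]

# Fewer confusions are taken to mean a greater distance,
# so sorting by confusions puts the largest distance first.
ranked = sorted(confusions, key=confusions.get)


def span(pair):
    """Positions of the two labels of a pair along ORDER, as (low, high)."""
    a, b = (ORDER.index(label) for label in pair)
    return min(a, b), max(a, b)


def consistent_with_line(table):
    """True if every interval that contains another shows fewer confusions."""
    for outer in table:
        lo_o, hi_o = span(outer)
        for inner in table:
            lo_i, hi_i = span(inner)
            contains = lo_o <= lo_i and hi_i <= hi_o and (lo_o, hi_o) != (lo_i, hi_i)
            if contains and table[outer] >= table[inner]:
                return False
    return True


is_consistent = consistent_with_line(confusions)
```

The check says nothing about the relative size of non-nested intervals such as freshman to sophomore and sophomore to senior, which is why the near tie between those two pairs does not violate it.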

One unfortunate characteristic of Table 1 deserves
some discussion. The table is not as symmetric
as one might like. For instance, freshmen
are miscalled sophomore substantially more often 
than sophomores are miscalled freshman. And 
these asymmetries are not entirely ascribable to 
sampling error. According to t tests, the difference
between opposite entries is significant at the
.01 level for two of the six pairs (freshman- 
sophomore and sophomore-junior). Does this im- 
ply that the distance from freshman to sopho- 
more is shorter than the distance from sopho- 
more to freshman? Not necessarily. The response 
sets or guessing tendencies which were alluded to 
earlier may play a part. It was suggested that 
the subjects guessed freshman too little and 
junior too freely. Various corrections for such 
response sets are possible. One simple procedure 
is to add a small constant to the entries in each 
column so as to bring the column totals into 
symmetry with the row totals. The necessary 
constants are .39 for the freshman column (¼ of
the difference between the total for the freshman 
column and the total for the freshman row), .10 
for the sophomore column, —.63 for the junior 
column, and .14 for the senior column. It is to 
be emphasized that the aim of the correction is
not to bring the column totals into equality
(they should in fact vary if there are generalization
gradients) but to make the table more symmetric
about the diagonal running from upper
left to lower right. The effect of the correction 
is to reduce the asymmetries markedly, making 
them all nonsignificant, without changing the or- 
dering of mean confusions arrived at earlier. 
This outcome tends to sustain the results of the 
direct unrectified averaging. 
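
The size of the column constants follows from simple bookkeeping. With four labels, adding a constant c to the three error cells of one column raises that column total by 3c; and because the four constants sum to zero (the grand total of the rows equals that of the columns), the same label's row total falls by c. The gap between column and row totals therefore closes by 4c, so c must be a quarter of the row total minus the column total. The sketch below illustrates this under stated assumptions: the function names are hypothetical and the example matrix is invented, not Table 1; only the list of four reported constants comes from the text.

```python
def column_constants(matrix):
    """One constant per column: (row total - column total) / number of labels.

    matrix[i][j] is the mean number of times label j was given when label i
    was correct. Diagonal cells (correct responses) are ignored.
    """
    n = len(matrix)
    rows = [sum(matrix[i][j] for j in range(n) if j != i) for i in range(n)]
    cols = [sum(matrix[i][j] for i in range(n) if i != j) for j in range(n)]
    return [(rows[k] - cols[k]) / n for k in range(n)]


def apply_correction(matrix):
    """Add each column's constant to the off-diagonal cells of that column."""
    n = len(matrix)
    consts = column_constants(matrix)
    return [
        [matrix[i][j] + consts[j] if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]


def totals(matrix):
    """Row totals and column totals over the off-diagonal cells."""
    n = len(matrix)
    rows = [sum(matrix[i][j] for j in range(n) if j != i) for i in range(n)]
    cols = [sum(matrix[i][j] for i in range(n) if i != j) for j in range(n)]
    return rows, cols


# Hypothetical error matrix (rows: correct label; columns: label given).
example = [
    [0.0, 3.0, 2.0, 1.0],
    [2.0, 0.0, 5.0, 3.0],
    [2.0, 4.0, 0.0, 4.0],
    [1.0, 3.0, 4.0, 0.0],
]
constants = column_constants(example)
corrected_rows, corrected_cols = totals(apply_correction(example))

# The four constants reported in the text, which sum to zero as the argument requires.
reported_constants = [0.39, 0.10, -0.63, 0.14]
```

After the correction each label's column total equals its row total, while the totals still differ from one label to another, which is the stated aim of the procedure.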

The main reason for drawing an equal number 
of subjects from each class in this experiment 
was to check on any possible differential “social 
perspectives” (Hartley & Hartley, 1955; New- 
comb, 1950; Sherif & Sherif, 1956) that might 
accompany class membership: say, a tendency
for a subject’s own class to be an anchor, or a 
tendency for the distances to classes neighboring 
his own to be large relative to the distances be- 
tween remote classes. No substantial evidence of 
such differential social perspective emerged. However,
one subject provided striking evidence of
another kind of personal distortion of the structure.
For this subject, freshman to junior seemed
to be by far the shortest distance (5.75 mean
confusions), with sophomore to junior next shortest
(3.50 mean confusions) and freshman to
sophomore next (2.62 mean confusions). Senior
was very distant from the other labels, having
1.25 mean confusions with freshman, 1.25 with
junior, and 1.88 with sophomore. This wildly deviant
pattern, at first inexplicable, made more
sense after the experimenter asked the subject
what class he belonged to (a question routinely
asked after the experimental session, as a final
check). The subject replied that he was both a
freshman and a junior: the university called him
a freshman, but his classmates were juniors. He
had flunked out two years previously and recently
been readmitted. His predicament seems rather
faithfully reflected in his confusions during learning.
(He was finally assigned to the junior group
of subjects on the basis of his statement that his
primary allegiance was to the junior class.)


DISCUSSION 


As a demonstration of a dimension of response 
generalization in verbal learning, this experiment 
is probably unmatched in the literature (see Un- 
derwood, 1950). The subjects were on the job, 
so to speak, treating the labels as ordered in a 
task for which the ordering was irrelevant, at the 
expense of such a familiar source of response con- 
fusion as the homophonic similarity of sophomore 
and senior. And precisely because the ordering 
showed up strongly where it was irrelevant, it is 
seen to be, for the subjects, an essential, unfor- 
gettable property of the labels, fundamental in 
their cognitive counterpart of the social structure.
This conclusion seems in good accord with
earlier arguments for the importance of an or- 
dering schema in social cognition (De Soto, 1960, 
1961). 

Anchoring. In like manner, because the task in 
no way demanded it, the indication that fresh- 
man and senior served as anchors attains a spe- 
cial significance. It had seemed fairly likely, in 
view of what social psychologists have said about 
people’s membership groups serving as anchors, 
that the subjects would learn most rapidly the 
names belonging in their own class. Instead, any 
such tendency disappeared in the race to learn 
the freshmen and seniors. End-anchoring was the 
winner of the impartial test. 

End-anchoring is a well-known phenomenon in 
absolute judgments (Volkmann, 1951). And, in 
the sense of better performance at the ends than 
in the middle of the ordering, end-anchoring is 
also seen in rote serial learning (Deese, 1958). 
The present end-effect strongly resembles these 
earlier end-effects, even to the detail of primacy- 
finality asymmetry—freshman being learned faster 
than senior. It would appear that end-anchoring 
is a universal phenomenon of people’s dealings 
with orderings. Surely one of the first principles 
of a theory of the cognition of social structures 
should be a statement that end-anchoring of so- 
cial orderings is to be expected. 

The concept of anchoring unfortunately lacks 
refinement—an anchor has been variously defined 
as a standard or reference point, as a particularly 
prominent part of a cognitive structure or field, 
as a part of a cognitive structure around which 
other parts are built or organized, as a particu- 
larly weighty determinant of behavior—but it is 
probable that these definitions are more comple- 
mentary than antagonistic. 

There are in the literature on social structures
various indications of end-anchoring. Sherif,
White, and Harvey (1955) found that in newly
forming boys’ groups the top and bottom statuses 
were established first. Kahl (1957) reports that 
people agree best as to the status or prestige of 
occupations at the extremes of the occupational 
hierarchy. Homans (1950) discusses a declining 
New England community in which the social 
structure was disintegrating and there was no 
longer much consensus on social standings—ex- 
cept at the top and bottom. The authors have 
observed that if people are asked what Indian 
castes they know, they usually mention that the 
Brahmans are the highest caste and the untouch- 
ables the lowest, but are unable to go beyond 
these first essentials. 

These illustrations show that end-anchoring 
has the effect of producing consensus among the
viewers of a social ordering as to what might be
called “reference groups” if that term were not
already pre-empted. It is noteworthy that the
strength of existing reference group theory (Newcomb,
1950; Sherif & Sherif, 1956) lies in its
recognition that reference groups vary, not only
from one membership group to another, but also
among individuals in a given membership group.
The point is made that consensus is lacking where 
it had been assumed to exist. But this is only one 
side of the picture and should not direct atten- 
tion altogether away from the places where con- 
sensus does exist. It has been argued that the 
widespread existence of social orderings depends 
on a consensual expectation of orderings (De 
Soto, 1960; De Soto & Kuethe, 1959). To that 
consensual base should be added end-anchoring,
which provides reference groups of a kind on
which there is consensus, not only among the
members of the society but among outsiders too.

End-anchoring seems to occur even when an 
ordering lacks ends, as is likely to be the case 
for orderings of people or events in time. For ex- 
ample, people everywhere tend to speak of the 
founder, the first member, of their family or 
lineage or clan, although this founder is more 
likely mythical than not (Fortes, 1953). Talk
about evolution usually centers around the first
and last representatives of an ordering: the first
man vs. modern man (speculation about the
“missing link” depends on the assumption of a
last ape and first man), the original economic
system vs. the ultimate one. And whoever feeds
the evident cognitive hunger for end anchors
of orderings, supplying the “ultimate” economic
system, for example, is capitalizing, wittingly or
unwittingly, on a dangerous human weakness.

Distances. The interpretation of errors during
learning as reflecting distances between the labels
seems at first rather divergent from their interpretation
as reflecting anchoring, but the two interpretations
are not incompatible, and may even
have a logical connection. Taylor (in press) concluded
that, in visual perception at least, anchors
operate to change the psychological distances in
their neighborhood. It is possible that freshman
to sophomore is the largest distance because
freshman is the principal anchor and that junior
to senior is the second largest distance because
senior is the second anchor.

In any case, the interpretation of the confusions
as reflecting distances, although appealing,
should be regarded with some caution, if only
because of the novelty of the technique. There
is very little precedent for scaling psychological
distances through confusions during learning
(Shepard, 1958; Torgerson, 1958); some sort of
judgmental task has almost universally been used
in such work, and there appears to be no precedent
for such an application of verbal learning.
People do sometimes speak of distances in social
structures, of big steps and little ones on
their social ladders, and this fact lends a face
validity to a distance interpretation of confusions
in the present data. And to the authors at least,
the distances reported for the undergraduate hierarchy
have substantial face validity. Unfortunately,
casual interviews of a few subjects failed
to provide clear independent confirmation of the
experimentally determined distances. But then
this failure may demonstrate only the superiority
of the confusions technique, which eliminates any
problems of the subjects’ interpretations of questions
about distances by not asking them any
such questions. Indeed, the case of the freshman-junior
subject suggests that the technique has
possibilities as a rather sensitive and unusually
neutral and subtle diagnostic instrument. But
final evaluation of the distances obtained in this
study must await the results of future applications
of the technique.


SUMMARY


A paired-associates learning experiment was 
performed in which men’s names were the stimuli 
and the labels freshman, sophomore, junior, and 
senior were the responses. 

The subjects learned most rapidly to apply the 
labels freshman and senior correctly, a result that 
was interpreted as end-anchoring. The errors the 
subjects made during learning were patterned in
such a way as to demonstrate generalization
gradients for each label. On the supposition that
the frequency of confusions of labels was inversely
related to the psychological distance between
them, it was determined that psychologically the
labels fell on a single dimension, and that the
distance from freshman to sophomore was largest,
the distance from sophomore to junior, shortest,
and the distance from junior to senior of intermediate
size.

Some discussion was given of the broader so- 
cial significance of the results, especially of the 
end-anchoring.
The purpose of the present study is to investigate
the effects of two pieces of music lying at
separate points along a single exciting-calming
dimension upon GSR (electrodermal response)
of depressive and schizophrenic patients.

The effects of music upon GSR in normals
have been studied by Dreher (1947), Henkin
(1957), and Traxel and Wrede (1959). More recently,
in an as yet unpublished study by the
present authors, the effects of exciting, neutral,
and calming music upon GSR and heart rate in
normals were investigated. Many factors such as
experimenter, pieces of music, and methods of
stimulation present in that study are the same
as those used in the study reported here. It was
found in the previous study that the exciting
music produced a decrease, the calming music an
increase, and the neutral music no change in
electrical resistance.

There does not appear to be any study on the
specific question of the effects of music on GSR
of psychotics. However, GSR has been used as a


1 This study was carried out with the cooperation 
of the administration and staff of the Milwaukee 
County Hospital for Mental Diseases. The authors 
wish particularly to thank Chris Buscaglia, Medical 
Director, John Liccione, Chief Psychologist, and Leo 
Muskatevc, Director of Musical Therapy.


response measure with psychotics, and the influ- 
ence of music upon some responses of psychotics 
has been studied. With reference to GSR, Paintal
(1951) found differences between normals and
psychotics under threat of electrical shock but no
difference in the presence of shock. Herr and Kobler
(1953) found differences in GSR between neurotics
and normals, and Enste and Meyer (1953)
found GSR differences between various types of
psychosis and between psychotics and normals.
With reference to music, Gilman and Paperte
(1949) investigated its effects upon behavioral
changes in psychotic and nonpsychotic groups.
They found that exciting music had no differential
effect upon the two groups whereas the
calming music had a greater effect in the expected
direction on the psychotic group than on
the normal group. Simon, Holzberg, Alessi, and
Garrity (1951) played eight piano pieces to normals,
schizophrenics, manics, and psychotic depressives
and asked them to indicate whether
each piece was happy, sad, or neither and whether
they liked or disliked each piece. They found no
significant differences between the normals and
the psychotics in their identification of the mood
of the music. Skelly and Haslerud (1952) played
a number of short pieces of music to 39 female
apathetic schizophrenic patients. The presentation
of the music to the patients began with
pieces rated as depressing, continued with a 
gradual shift to pieces rated as exciting, and con- 
cluded with a shift back to pieces rated as de- 
pressing. The patients were given such a series 
of pieces lasting 20-30 minutes daily for over a 
month. It was found that in comparison to their 
base activity level the patients showed a statistically
significant increase in activity level during
the playing of the exciting music. No carry-over 
effect of the exciting music on activity level after 
a 6-hour period was found, thus indicating that 
the stimulating effects of the music were short- 
term. 

The present study is composed of two separate 
but related experiments. Experiment I employs 
depressive patients and Experiment II employs 
schizophrenic patients. The hypothesis being 
tested in each experiment is that the music pro- 
duces differential effects upon GSR and, more 
specifically, that the piece of music designated 
as exciting produces a decrease in electrical re- 
sistance of the skin and the piece of music desig- 
nated as calming produces an increase in elec- 
trical resistance of the skin. A change in resist- 
ance is considered as a departure from the level 
of resistance present at the start of the music.

The assumption is made here that GSR is a 
manifestation of emotional response. A decrease 
in electrical resistance is thus interpreted as be- 
ing due to an increase in emotional excitation 
while an increase in electrical skin resistance is 
interpreted as being due to a decrease in emo- 
tional excitation. The present study using GSR is 
thus conceived of as an investigation of the ef- 
fect of music upon the emotional level of psy- 
chotics. 


METHOD 
Subjects 


In Experiment I, subjects were 14 female and 4 
male patients who were diagnosed as having one of 
the following types of depressed illness: involutional, 
neurotic, manic-depressive depressed, or reactive. The 
mean age was 52 years. The 18 subjects were ran- 
domly selected from the total available hospital 
population of 45 such patients. Experimental data 
on 8 additional subjects were discarded, since within 
one week after being run they were judged by two 
clinical psychologists on the basis of individual in- 
terviews as no longer clinically depressed. 

In Experiment II, subjects were 13 female and 5 
male patients diagnosed as possessing one of the ma- 
jor types of schizophrenic illness. Patients who pre- 
sented a history of any different diagnosis were ex- 
cluded. The mean age was 48 years. The 18 subjects 
were randomly selected from the total available hos- 
pital population of 101 older patients who had been 
admitted in the previous 3 years. 




Apparatus and Material 


Selected 6-minute portions of two musical pieces, 
Dvorak’s “Final Movement” of the New World Symphony
and Bach’s “Air for the G String” (MLS5115
and MLS065, Columbia Record Company) were 
judged by the authors to be exciting and calming, 
respectively. As an empirical check on the authors’ 
judgment, 59 lower division undergraduates were 
asked to rate each (edited) piece on a five-point 
exciting-calming scale and on a five-point familiarity 
scale. Comparisons of the ratings by means of Wil- 
coxon’s matched-pairs signed-ranks test (Siegel, 1956) 
showed a significant difference (p< .001) between 
the two pieces in the expected direction on the excit- 
ing-calming scale. In addition, the “Air for the G 
String” was rated as the more familiar piece (p <
.001). It does not seem likely, however, that the difference
in familiarity found for the college students
would also be found for the psychotics used in this 
study. 

A tape of the selected portion of each piece was 
played on a Wollensak tape recorder over a high 
fidelity loudspeaker. GSR was measured using a psy- 
chogalvanometer (Stoelting, Model No. 24207). The 
measures consisted of reaction units read from a dial 
located on the front of the galvanometer. In the pres- 
ent experiment, a change of 12.5 reaction units from 
the zero dial reading represented a change in resistance
of 1000 ohms. This galvanometer does not permit
the determination of a base level. Before each of
the two pieces of music was played to each subject, 
the galvanometer was adjusted to match the (un- 
known) level of resistance of the subject, resulting 
in a dial reading of zero.


Procedure


In both experiments, each subject was run individually
and was exposed to both musical pieces.
Two sequences of pieces were used, namely, AB and 
BA, with each subject assigned randomly to a se- 
quence. The subject was seated in a chair facing a 
wall and surrounded on the other three sides by hos- 
pital bed screens. The experimenter and all recording 
and playing apparatus except the loudspeaker were 
located outside the screened area (about 7’ X 5’ in 
size). 

The sequence of events for each subject was as 
follows: instructions to the subject and connection 
of the galvanometer to the subject (3 minutes) ; pre- 
music measurement (1 minute); measurement with 
music playing (6 minutes); postmusic measurement 
(1 minute) ; rest period outside the screened area (4 
minutes). The entire sequence was then repeated 
with the second piece of music beginning with the 
3-minute instruction and connection period. The sub- 
jects were instructed to sit quietly and merely listen 
to the music. The number of reaction units shown 
on the galvanometer was recorded every 15 seconds,
yielding a total of 24 readings per subject for each 
piece. 
RESULTS
Experiment I 

The results for the exciting and calming music 
played to the depressive patients, expressed in 
mean GSR reaction units, are summarized in 
Table 1. The mean values shown for each suc- 
cessive minute of time were derived by averaging 
the four readings obtained during any given min- 
ute for a given subject and then computing a 
grand mean for all 18 subjects. 
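
The reduction of the raw readings to the tabled means can be sketched as follows. The calibration (12.5 reaction units per 1000 ohms) and the sampling rate (one reading every 15 seconds, hence four per minute) are taken from the Method section; the function names and the readings themselves are hypothetical and serve only to show the arithmetic.

```python
from statistics import mean

READINGS_PER_MINUTE = 4        # one reading every 15 seconds
UNITS_PER_1000_OHMS = 12.5     # calibration stated for this galvanometer


def minute_means(readings):
    """Average consecutive groups of four readings into one value per minute."""
    if len(readings) % READINGS_PER_MINUTE != 0:
        raise ValueError("expected a whole number of minutes of readings")
    return [
        mean(readings[i:i + READINGS_PER_MINUTE])
        for i in range(0, len(readings), READINGS_PER_MINUTE)
    ]


def grand_means(all_subjects):
    """For each minute, the mean across subjects of the per-subject minute means."""
    per_subject = [minute_means(readings) for readings in all_subjects]
    return [mean(values) for values in zip(*per_subject)]


def units_to_ohms(reaction_units):
    """Convert a change in reaction units to a change in ohms."""
    return reaction_units / UNITS_PER_1000_OHMS * 1000.0


# Hypothetical readings for two subjects over the first two minutes of one piece.
subject_a = [0.0, 2.0, 4.0, 6.0, 8.0, 8.0, 8.0, 8.0]
subject_b = [2.0, 2.0, 2.0, 2.0, 4.0, 4.0, 4.0, 4.0]
table_row = grand_means([subject_a, subject_b])
```

In the actual experiment each subject contributed 24 readings per piece, giving six minute means, and the grand mean was taken over 18 subjects.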

The main hypothesis of the experiment is that 
the exciting music produces a decrease and the 
calming music produces an increase in the elec- 
trical resistance of the skin. An increase or de- 
crease is a departure from the level of resistance 
present at the start of the music, a level desig- 
nated by zero reaction units. As shown in Table 1, 
the means for the exciting music are all positive 
(indicating a decrease in resistance) and those 
for the calming music are all negative (indicat- 
ing an increase in resistance). The significance of 
the difference of the mean number of GSR re- 
action units from a mean of zero reaction units 
was tested using the t test for each period of
elapsed time for each piece of music. As shown
in Table 1, all the t values are significant (p
< .005 or p < .0005) except the one for the first 
minute of calming music. The hypothesis is thus 
confirmed when the exciting music is played for 
1-6 minutes and is confirmed when the calming
music is played for 2-6 minutes.
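
The test applied at each minute is a one-sample t test of the mean change against zero. A minimal sketch is given below, assuming the conventional form of the statistic, the mean divided by the estimated standard error, with n - 1 degrees of freedom. The function name one_sample_t and the sample values are hypothetical; the report does not give the computing formula that was used.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(values, mu0=0.0):
    """Return (t, degrees of freedom) for the hypothesis that the population mean is mu0."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    # stdev() uses the n - 1 divisor, so this is the usual estimated standard error.
    standard_error = stdev(values) / sqrt(n)
    return (mean(values) - mu0) / standard_error, n - 1


# Hypothetical changes in reaction units for four subjects at one minute.
changes = [2.0, 4.0, 6.0, 8.0]
t_value, df = one_sample_t(changes)
```

With 18 subjects the statistic has 17 degrees of freedom. Because the hypothesis specifies the direction of change for each piece, the obtained t is referred to a one-tailed critical value.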

Examination of the mean values at each time 
period indicates that after 1 minute the exciting 
music produced a decrease in electrical resistance.
After 2 minutes, an even greater decrease
occurred, but from then on the decrease in re- 
sistance began to approach a plateau. How long 
this plateau lasts and in what direction subse- 
quent responses tend are questions needing in- 
vestigation. For the calming music, an increase 
in electrical resistance occurred after 2 minutes. 
The level of resistance remained substantially 
the same for the remaining time periods. The 
magnitude of change in resistance, disregarding 
whether the change was in the direction of an in- 
crease or decrease from zero, was much greater 
in response to the exciting than to the calming 
music. Tests of the significance of the differences 
between the two musical pieces in the magnitude 
of change from zero, disregarding direction of 
change, yielded significant t values (minimum
p < .02) at 2 through 6 minutes. Thus the re- 
sponse of the depressives to each piece of music 
differed in direction (decrease vs. increase), la- 
tency (1 vs. 2 minutes) and magnitude (large vs. 
small). 

A comparison between the results obtained for 
each piece of music shows that for the one-minute 
period the t value for the difference in variances
(Walker & Lev, 1953) is not significant while the
t value for the difference in means is significant
(p < .005, one-tailed test). For each of the remaining
five time periods, the t values for differences
in variances and in means are significant
(p < .0005). These analyses clearly demonstrate 
that the playing of the exciting and calming mu- 
sic produced a difference in the level of electrical 
resistance of the skin (relative to zero) and in 


TABLE 1
MEANS, STANDARD DEVIATIONS, AND t VALUES* OF CHANGES IN GSR REACTION UNITS FOR
EIGHTEEN DEPRESSIVES

                      Elapsed time in minutes
Measure      1             2         3         4         5         6
Exciting
  Mean       [illegible]   11.19     13.33     13.94     15.94     16.43
  SD         [illegible]   10.63     12.65     13.04     12.88     16.48
  t          [illegible]   4.34***   4.34***   4.41***   5.10***   4.11***
Calming
  [Column positions not recoverable. Legible means: -3.85, -4.47, -4.21; SDs: 3.18, 4.31; t values: 5.47 (flag illegible), 5.80***, 4.01***.]

* Mean changes tested against a hypothesized population mean of zero (H: μ = 0). All tests are one-tailed.
** p < .005.
*** p < .0005.
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rABLE 2 


MEANS, STANDARD DEVIATIONS, AND ! VALUES* OF CHANGES IN GSR REAcTION UNITS FOR 
EIGHTEEN SCHIZOPHRENICS 


Measure 


Exciting 


4 90*** 


Calming 


*® Mean changes tested agai 


*p <.01 
* > < .005 
* > < .0005 


the consistency of the level of resistance. The re- 
sponse of the depressives to the exciting music 
was much less consistent than was their response 
to the calming music.

Experiment II

The results for the exciting and calming music 
played to the schizophrenic patients are sum- 
marized in Table 2. The mean value for each 
time period in Table 2 was computed in the same 
manner as in Experiment I. The main hypothesis 
of the experiment, namely, that the exciting music
produces a decrease and the calming music
produces an increase in the electrical resistance
of the skin, was tested by determining the significance
of the difference of the mean number
of GSR reaction units from a mean of zero reaction
units. As shown in Table 2, all the t values
are significant (minimum p < .01) except, again,
for the first minute of the calming music. The
hypothesis is thus confirmed for the exciting music
and, with one exception, for the calming music.

As shown in Table 2, a significant decrease
in electrical resistance occurred in response to
the exciting music after 1 minute, and this decrease
in resistance tended to become considerably
greater over time. For the calming music, a
significant increase in resistance took place after
2 minutes, but the level of resistance tended to
remain the same over time. The magnitude of
change in resistance from zero, disregarding the
direction of change, was significantly greater
(using the t test, p < .01 at 2 through 6 minutes)
for the exciting than for the calming music.
Thus, as was found for the depressives, the
response of the schizophrenics to the two pieces
of music differed in direction, latency, and magnitude.

A comparison between the results for the two pieces of music shows that all the t values for the differences in variances are significant (minimum p < .01) as are all of the t values for the differences in means (p < .0005, one-tailed test). For all six periods of time, then, the playing of the exciting and calming music produced a difference in the level of electrical resistance of the skin (relative to zero) and in the consistency of the level of resistance. As was found for the depressives, the response of the schizophrenics to the exciting music was considerably less consistent than their response to the calming music.
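
The test applied at each time period, a comparison of the mean change in GSR reaction units against a hypothesized mean of zero, can be illustrated with a minimal sketch. The function name and the four data values below are hypothetical and are not taken from the study; the formula is the ordinary one-sample t statistic, which is assumed here to correspond to the authors' procedure.

```python
import math
import statistics


def one_sample_t(changes, mu0=0.0):
    """Return (t, df) for testing the mean of `changes` against `mu0`.

    Uses the usual one-sample statistic t = (mean - mu0) / (sd / sqrt(n)),
    with sd computed on n - 1 degrees of freedom.
    """
    n = len(changes)
    mean = statistics.fmean(changes)
    sd = statistics.stdev(changes)  # sample SD (n - 1 in the denominator)
    t = (mean - mu0) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical changes in GSR reaction units for four subjects (illustration only).
changes = [2.0, 4.0, 6.0, 8.0]
t_value, df = one_sample_t(changes)
print(f"t = {t_value:.3f}, df = {df}")
```

For a one-tailed test, the resulting t would be compared with the tabled critical value for the given degrees of freedom in the predicted direction only.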


DISCUSSION 


The major finding of the two experiments is that the pieces of music judged by college students to be exciting and calming were capable of producing differential changes in GSR of psychotics. These same pieces of calming and exciting music have also been found by the authors to produce similar changes in GSR of normals. It seems, then, that the effects of these musical stimuli are quite general.

A more detailed comparison between the psychotics used in this study and the normals (college students) used in the previous study indicates that the exciting music produced a greater decrease in the electrical resistance of the normals than of the psychotics while variability was similar for both groups. This comparatively muted response of the psychotics to the exciting music might be explained by their reduced emotional contact with reality, particularly reality capable of arousing them. The response of the normals and the psychotics to the calming music was similar with respect to the magnitude of increase in electrical resistance, but the normals tended to be somewhat more variable in their response.

In the present study, the most pronounced difference between the results for the depressives and the schizophrenics lies in the greater variability of the response of the depressives to the exciting music. The conception of the GSR as a manifestation of an emotional response of the subjects to the music provides the basis for a possible interpretation of this difference in variability. It is assumed here that the depressive is characterized by a reduction of response to those emotional stimuli that are not related to the affective or mood component of his depression. Normal efforts to cheer up the depressive, for example, are generally unsuccessful. It is also assumed that the exciting music is not related, on the whole, to the affective component of the depression of the patients used as subjects. The music, however, is capable of arousing a response, as has been shown in this and the previous study by the authors. Observations of the depressives during the playing of the exciting music gave the impression that some were trying to resist being influenced by the stimulus. The implication of these observations is that the depressives were susceptible to influence, but did not want to be influenced, and therefore made an effort to resist being influenced. The greater variability of response by the depressives to the exciting music might well be due to a variation in success with which each patient was able to resist being emotionally aroused by the exciting music. The schizophrenics, on the other hand, were shown to be more consistent in their response to the exciting music.

The finding in the present study that the calming and the exciting music produced predicted changes in GSR serves to substantiate the findings for exciting and calming music obtained by other investigators (Gilman & Paperte, 1949; Skelly & Haslerud, 1952) using other response measures. These few studies do lend some limited support to the validity of the use of music with psychotics. The present study indicates the feasibility of at least temporarily modifying the general emotional level of depressive and schizophrenic patients.


SUMMARY 


Two experiments, one using depressive and the other using schizophrenic patients, were conducted to test the hypothesis that calming music produces an increase and exciting music a decrease in electrical resistance of the skin (GSR). In both experiments, a musical piece judged by college students to be exciting and another piece judged to be calming were played for 6 minutes in counterbalanced order to 18 randomly selected depressives and to 18 randomly selected schizophrenic patients. Measures of GSR were obtained for each one of the 6 minutes during which the music was played. The hypothesis was confirmed in each experiment.

It was found for both the depressives and the schizophrenics that the decrease in electrical resistance due to the exciting music was of greater magnitude and shorter latency than the increase in resistance due to the calming music. Comparison of the results for the two pieces of music within each experiment demonstrated a difference in the level of electrical resistance due to the music and in the consistency of the level of resistance. The response to the exciting music was less consistent than the response to the calming music. The changes in electrical resistance are interpreted as due to emotional effects produced by the music. The possibility is thus presented that music can be used to modify temporarily the general emotional level of depressive and schizophrenic patients.

THE EFFECT OF NEGATIVE VERBAL CUES UPON
VERBAL BEHAVIOR 1


JACK SANDLER 2


Florida State University 


Within the last 5 years, a number of investi- 
gators have successfully expanded the operant 
conditioning paradigm to human activity, espe- 
cially with regard to verbal behavior. Since the 
initial observation by Greenspoon (1955) that 
verbal behavior can be manipulated by operant 
techniques, a host of subsequent studies have 
provided further evidence of the degree to which 
this phenomenon is subject to the same vari- 
ables underlying the bar pressing response in rats 
(Krasner, 1958). 

The majority of these studies, however, has been restricted to the use of positive secondary reinforcement as the independent variable. The cues have ranged anywhere from “Mmhmm” and “Good” to head nodding and “paying attention” (Adams & Hoffman, 1960). Although there is good reason to believe that negative reinforcement also has relevancy in this area, this problem had not been systematically attacked. Salzinger (1959) indicated that here, in particular, experimental knowledge is lacking. Conducting such an investigation would provide further knowledge regarding the degree to which operant principles could be generalized to human behavior, as well as indicating possible practical implications.

For this purpose, two particular questions were formulated: Does a negative verbal cue have an effect upon the amount of verbalizations in an operant situation? Do different schedules of the negative cue have different effects?


1 This paper is based on a dissertation submitted to the Graduate School of Florida State University in partial fulfillment of the requirements for the PhD degree. The writer wishes to express his sincere appreciation to Joel Greenspoon, dissertation chairman, and to the staff of the Veterans Administration Hospital, Gulfport, Mississippi, without whose generous cooperation this study could not have been undertaken.

2 Now at the Veterans Administration Hospital, Coral Gables, Florida.


METHOD 

Subjects 

Sixty male patients selected from a Veterans Administration neuropsychiatric hospital were used in this study. They were all residents of the acute treatment wards and ranged from 28 to 42 years of age. The actual selection employed involved a perusal of the records of all patients on each ward, eliminating those with below average IQ. The remaining men were then individually requested to participate in a research project. The only requirements regarding their selection were average intelligence (IQ of 9[remainder illegible], as determined by the Shipley (1940) Institute of Living Scale) and cooperativeness. The latter was operationally defined by the procedure as outlined below.


Procedure 


Each subject was conducted individually into the testing room and requested to take a seat facing the desk. The experimenter took the seat behind him, thus placing himself outside the visual range of the subject. A recorder and microphone in full view of the subject were then put into operation.

The response class employed in this study was verbalizations in general. In order to provide the means for the manipulation of such an operant response without the subject’s awareness of the nature of the problem (Adams, 1957), the experiment was disguised as an attempt to investigate the validity of the TAT (Murray, 1943) and the Symonds’ (1949) Picture Story Test. To provide for this, the following instructions were then given:


I am interested in finding out something about 
a test. These cards consist of different pictures. It’s 
supposed to be very easy for anyone to interpret 
the important point in every picture. I would like 
to know just how easy this really is. I'll show 
these pictures to you one at a time. In each case,
what I want you to do is tell me what you think 
is happening. I am especially interested in what 
you think the people are feeling, what events led 
up to the present situation, how things will turn 
out, and so forth. Try to use your imagination 
as best as you can. In order to let you know how 
you're doing, I'll tell you when you're off the 
track. In other words, I won't say anything unless 
you start missing the boat; but regardless of 
whether or not I say anything, keep talking until 
you're satisfied that you can think of nothing 
more to say about a picture. Then, lay the card 
down on the table and I'll hand you the next one. 
The combined cards of the two tests were randomized and were presented to each subject individually. Recordings were kept of the time and number of cues administered. The entire experimental procedure was divided into three continuous phases. Phase I lasted 4 minutes and constituted an attempt to determine operant rate of verbal production. Since operant conditioning requires the presence of an existing response, those subjects whose initial latency was over 20 seconds, or whose initial verbalizations were less than 10 seconds, were discarded as “uncooperative.” On the basis of this criterion, 11 subjects were discarded.

Phase II constituted the Training Phase. The negative cue employed in this study was “Unh unh.” Although arbitrarily selected, there was strong a priori reason to assume its influence as a negative generalized reinforcer as suggested by Skinner (1953). Greenspoon (1955) demonstrated that it was effective for at least one response class. This cue was administered by means of differential interval schedules. Subjects in Group 1 received the cue every 20 seconds (FI: 20 seconds), in Group 2 every 40 seconds (FI: 40 seconds), and in Group 3 in a variable interval order. Here the cue occurred anywhere from 5 to 35 seconds around a mean of 20 seconds. Thus the schedules in Groups 1 and 2 represent periodic reinforcement, and in Group 3 aperiodic reinforcement. Subjects in Group 4 (the control group) received no cues during this time but were permitted to verbalize for the same length of time as subjects in Group 1. Phase II lasted until 20 such cues had been administered for all groups.
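
The three cue schedules described above can be sketched in a few lines to show how long Phase II would run under each. This is an illustration only: the function names are hypothetical, and drawing the variable intervals uniformly between 5 and 35 seconds is an assumption, since the source states only the range and the mean of 20 seconds.

```python
import random


def fixed_interval_cues(interval_s, n_cues=20):
    """Cue times in seconds from the start of Phase II on a fixed-interval schedule."""
    return [interval_s * k for k in range(1, n_cues + 1)]


def variable_interval_cues(n_cues=20, low_s=5.0, high_s=35.0, seed=0):
    """Cue times with gaps drawn uniformly from [low_s, high_s] (assumed distribution)."""
    rng = random.Random(seed)
    elapsed = 0.0
    times = []
    for _ in range(n_cues):
        elapsed += rng.uniform(low_s, high_s)
        times.append(elapsed)
    return times


def thirty_second_periods(cue_times, period_s=30.0):
    """Number of 30-second scoring periods elapsed when the last cue is given."""
    return cue_times[-1] / period_s


fi20 = fixed_interval_cues(20)    # Group 1
fi40 = fixed_interval_cues(40)    # Group 2
vi20 = variable_interval_cues()   # Group 3, one possible sequence

for label, times in (("FI 20", fi20), ("FI 40", fi40), ("VI 20", vi20)):
    print(f"{label}: Phase II ends at {times[-1]:.0f} s, "
          f"about {thirty_second_periods(times):.1f} thirty-second periods")
```

Under these assumptions the FI 40 group spends twice as long in Phase II as the FI 20 group (800 versus 400 seconds), which is why the number of 30-second periods differs between groups.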

The last part of the experiment (Phase III) involved the withdrawal of the cue and provided an opportunity for all subjects to verbalize for an additional 4 minutes.

At the conclusion of the procedure, each subject was asked if he thought the experimenter had in any way made him modify his behavior and if he could think of any way that his verbal rate had been changed. The design was thus constructed to enable each subject to act as his own control as well as to provide the means for a comparison between three experimental groups and a control group.


RESULTS 


In order to analyze the data systematically, the response measure was organized in terms of number of words produced per 30-second period.


TABLE 1

MEANS, SDs, AND F RATIOS OF NUMBER OF RESPONSES PRODUCED PER THIRTY-SECOND INTERVAL FOR ALL THREE PHASES

Group      Phase I M    Phase II M    Phase III M    Phase III SD
1          41.87        31.63         39.69          12.84
2          37.49        32.14         35.82          14.83
3          40.27        24.19         29.88          10.76
4          34.86        35.30         36.24          10.67
F ratio    1.28         7[decimals illegible]*       1.51

[SD columns for Phases I and II: two sets of values survive (11.51, 12.73, 12.37, 8.96 and 12.03, 12.77, 6.85, 8.38), but the source does not show which phase each set belongs to.]

* Significant at or beyond .01 level.


Table 1 presents the mean response rate for 
each group throughout the three phases. The ex- 
perimental design provided for an evaluation of 
the data along two lines. Between-group perform- 
ances for each phase constituted one analysis; 
within-group performances from phase to phase 
constituted another analysis. 


Between-Group Comparisons 


Despite the use of a random procedure, Table 1 reveals considerable differences in initial mean response rate from a low of about 35 words per 30-second period for Group 4 to a high of about 42 words per 30-second period for Group 1. However, the results of an analysis of variance indicated in Table 1 reveal that these differences were not significant. On the other hand, the analysis of variance F ratio for the training phase data was significant beyond the .01 level. Furthermore, the relative positions of the four groups had changed as compared to their original levels. In Phase II, the control group revealed the highest level of verbalizations, with the variable-interval group responding at a mean rate far below that of the other three groups.

During Phase III, the differences in mean response rate diminished and this was reflected in the nonsignificant F ratio revealed in Table 1. The three experimental group rates drew close together and shifted upward once again. As in Phase I, prior to the introduction of the negative cue, subjects in Group 1 revealed the highest mean response rate. Subjects in Group 3, however, continued to produce words at a lower rate than the other three groups.

While there was evidence to assume that the 
differences revealed during the training phase 
could be attributed to the various schedules of 
the negative cue, several possibly confounding 
variables had to be isolated. One of these, initial 
verbal facility, was adjusted for by means of an analysis of covariance (Edwards, 1956). Table 2 presents the results of this analysis. The obtained F value was significant beyond the .01 level, indicating that the training phase differences were not attributable to differences in initial verbal facility.

An additional possibly confounding variable was the variation in number of 30-second periods experienced by the four groups. The covariance F (Edwards, 1956, p. 34[illegible]) ratio for these data was significant beyond the .01 level, indicating that the Phase II differences were not attributable to differences in length of time in the training phase.

The Phase III data provided some interesting information despite the lack of significance between the group differences. Using operant level performance (Phase I) and training level performance (Phase II) as predictor variables (Edwards, 1956), the Phase III data were subjected to analyses of covariance. The former F ratio was not significant, revealing little response change in relative positions between the four groups. The latter analysis, however, was significant far beyond the .01 level, indicating that the groups made a considerable change in their relative positions between Phases II and III.



Within-Group Comparisons 


Inspection of Table 1 reveals that the control group maintained a relatively stable mean rate of response, fluctuating less than 2 words per 30-second period from phase to phase. The experimental groups, on the other hand, demonstrated extensive changes in performance from phase to phase. Considering the differences between Phase I and Phase II, Group 1 response rate dropped about 10 words per 30-second period, Group 2 level decreased by about 5 words per 30-second period, while subjects in Group 3 talked an average of about 16 words less per 30-second period. With the cue withdrawn (Phase III) all of the experimental groups increased their verbal production rate when compared with Phase II performances. Actually, Groups 1 and 2 almost returned to their operant level strength. Subjects in Group 3, however, were still considerably below their initial response level, although also reflecting an increase over Phase II production level.

None of the subjects was capable of verbalizing any relationship between the experimenter’s cues and his own performance, nor could any express any recognized change in his rate of verbal production.
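
The phase-to-phase changes quoted above can be checked with a short calculation. The sketch below is illustrative only: the means are the Phase I, II, and III values as read from Table 1, and assigning them to Groups 1 through 4 in row order is an assumption that agrees with the approximate figures given in the text. The function name is hypothetical.

```python
# Mean words per 30-second period in Phases I, II, and III, as read from Table 1.
# The group-to-row assignment is an assumption consistent with the text.
PHASE_MEANS = {
    "Group 1": (41.87, 31.63, 39.69),  # cue every 20 seconds
    "Group 2": (37.49, 32.14, 35.82),  # cue every 40 seconds
    "Group 3": (40.27, 24.19, 29.88),  # variable interval
    "Group 4": (34.86, 35.30, 36.24),  # control, no cue
}


def phase_changes(phase1, phase2, phase3):
    """Return (drop during training, recovery after cue withdrawal), rounded to 2 places."""
    drop = round(phase1 - phase2, 2)      # Phase I minus Phase II
    recovery = round(phase3 - phase2, 2)  # Phase III minus Phase II
    return drop, recovery


changes = {group: phase_changes(*means) for group, means in PHASE_MEANS.items()}
for group, (drop, recovery) in changes.items():
    print(f"{group}: drop {drop:+.2f}, recovery {recovery:+.2f}")

# Largest fluctuation of the control group across the three phases.
control_range = round(max(PHASE_MEANS["Group 4"]) - min(PHASE_MEANS["Group 4"]), 2)
print(f"Control group range: {control_range}")
```

On this reading the drops for Groups 1, 2, and 3 are 10.24, 5.35, and 16.08 words, matching the "about 10," "about 5," and "about 16" in the text, and the control group varies by 1.38 words, which is under the 2 words stated.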


DISCUSSION


In terms of answering the two questions which generated the study, there is strong evidence to indicate that the cue “Unh unh,” acted as a negative reinforcer (Skinner, 1953) by increasing response probability when removed from a conditioning situation. In addition, it also appears likely that the different schedules of the negative cue had different effects.

With respect to the first finding, it is easily seen that the present experiment confirms much of the data derived from operant studies on subhuman organisms. It is generally recognized, for example, that presenting an aversive stimulus to an operant response will have two effects (Estes, 1944; Jenkins & Stanley, 1950; Skinner, 1953): a decrease in response probability when administered, or suppression of the response, and an increase in response probability when withdrawn. This is clearly demonstrated in the present study where response rates for the experimental groups diminished during training, but increased when the cue was withdrawn during the final phase, and in the cases of Groups 1 and 2, almost returned to operant levels.


TABLE 2

ANALYSES OF COVARIANCE OF PHASES II AND III

Adjusted variable                                  Source
Phase II (Adjusted for initial differences)        Between-groups; Within-groups; Total
Phase II (Adjusted for time periods)               Between-groups; Within-groups; Total
Phase III (Adjusted for initial differences)       Between-groups; Within-groups; Total
Phase III (Adjusted for training differences)      Between-groups; Within-groups; Total

[Numeric columns only partly recoverable from the source. Surviving entries, in source order: 561.24, 569.62; 829.47, [digit illegible],686.64, 3,516.11; 3,368.63, 911.45, [digit illegible],880.08; 818.29, 151.43, [digit illegible]969.72.]

* Significant at .01 level.


The most extensive investigation of this phenomenon has been reported by Estes (1944). Estes’ conclusions were derived from a series of studies evaluating the effect of electric shock upon the bar pressing response. Both his and the current study, however, seem to fall into the type of punishment situation which involves the presentation of a negative reinforcement. In agreement with Skinner (1953), and as in the present situation, he found that response strength increased when the cue was withdrawn.

While these observations have been confirmed numerous times, the theoretical implications of this phenomenon have been largely neglected. Recently, however, Dinsmoor (1954) formulated a general framework within which free response aversive conditioning could be explained by means of an “avoidance hypothesis.” Essentially, this principle implies that the suppressive action of an aversive stimulus is due to the conditioning of avoidance reactions which conflict with the original behavior being modified. Such an explanation might fit the findings of the present study. In Phase II, for example, it is possible that silence, in conflicting with the chain of reactions leading to the response associated with the administration of the negative cue, changed the situation from an aversive one to a nonaversive one. Inasmuch as the experimental arrangement was probably only mildly threatening, the response was not completely suppressed in favor of total silence. Consequently, subjects merely talked at a slower rate and when the cue was completely withdrawn in Phase III, verbal rates began to accelerate, depending upon the schedule to which the subject had been exposed. One can only speculate, however, regarding the outcome of a similar experiment wherein a primary negative reinforcer (such as shock) would be employed as the negative cue. This would provide an additional test of the “avoidance hypothesis.”

With respect to the question of schedule influence, here again, a considerable mass of data has been accumulated, primarily on subhuman organisms. In their review of the literature, Jenkins and Stanley (1950) indicate that a continuous schedule is almost always more effective in training than is a partial schedule, but partialing usually results in greater durability of effect. These principles, however, are adduced largely from studies employing positive reinforcement. Estes’ (1944) data suggest that these principles are also true when the independent variable is aversive in nature.

The present study offers no opportunity to evaluate the difference between continuous and partial schedules. However, it is reasonable to assume that cue administration in Group 2 (every other 20 seconds) represents greater partialing than in Groups 1 and 3. Since the average drop in response rate from Phase I to Phase II for Group 1 was twice that of Group 2, some confirmation of Estes’ (1944) findings is revealed. This is further substantiated by comparing the variable-interval group performance with that of Group 2. Clearly, verbal rate decreased most markedly under this condition. Thus these data support the observation that the shorter the interval between cue administration, regardless of whether or not the cue is aversive in nature, the greater the effects in terms of training.

With respect to durability of effect, it is obvious that the variable interval schedule was again the most effective. This appears to be due largely to the fact that response rate reduction for this group during Phase II was so great that these subjects simply did not have enough time during the final stage of the experiment to return to their initial level of response strength. Thus the durability of the effect apparently relates directly to the differences in Phase II.

There seems to be little question that the reinforcement “Unh unh,” presented on a variable interval basis, represented a more effective aversive stimulus than the same cue administered periodically (fixed interval). It is difficult to determine why this particular arrangement should have had such a profound effect. There is little previous research with which to relate these findings, particularly in the field of punishment of a “free responding” operant. Azrin (1956) reports a study which investigated the effects of fixed and variable interval as well as immediate and nonimmediate punishment on the pigeon pecking response. Although Azrin’s focus was primarily in terms of evaluating the effects of shock associated with response and shock not associated with response, his findings have some bearing on the present study. Regarding the two conditions which have the most relevance for this investigation (fixed interval shock without response-shock correlation and variable interval shock without response-shock correlation), under the first condition, Azrin found a negatively accelerated pattern of response prior to the delivery of each shock, followed by recovery. In the second procedure, response rate was more erratic with, however, no consistent deviation from a uniform rate of responding as in the first case. Furthermore, a comparison of the cumulative curves as well as the statistical data which he presents reveals a lower overall rate of response for the aperiodic conditions than for the periodic conditions.

These findings, then, are quite similar to those of the present study, which revealed the greater effectiveness of the aperiodic schedule, and reflect the possibility that the relative influence of schedules under aversive stimuli seems to be opposite to the effects of positive reinforcement presented under the same conditions (Ferster & Skinner, 1957). That is, the effects of periodic
and aperiodic punishment, whether primary or 
secondary, are analogous to the effects of periodic 
and aperiodic reward. The difference seems to be 
one of direction. 

With regard to further research in this area,
this study has demonstrated that certain operant
principles can be generalized to verbal behavior, 
including those situations which employ variously 
scheduled aversive stimuli. As in Estes’ (1944) 
and Azrin’s (1956) studies, the immediate effect 
of such an arrangement was a depression in the 
rate of response, i.e., number of words produced 
per 30-second interval. Furthermore, this effect 
differed depending upon the particular schedule 
employed, and was maintained as long as the 
negative reinforcer was administered. Discon- 
tinuation of the aversive stimulus resulted in al- 
most complete recovery of response rate in two 
of the groups and in partial recovery by the 
third group. Thus in those cases where the elimi- 
nation of verbal behavior is an objective, the 
present study suggests certain limitations in the 
employment of a secondary, aversive reinforce- 
ment. 


SUMMARY 


This study investigated the effect of a negative verbal cue upon verbal rate in a projective test-like situation. Sixty hospitalized patients were randomly assigned to one of four groups for the purpose of administering different schedules for the verbal cue: 20-second interval, every other 20-second interval, variable interval, and control. The stimulus materials were the TAT and the Symonds Picture Story Test. Each group experienced three continuous phases. During the first 4 minutes verbal responses for the four groups were recorded and no differential treatment occurred. Following this, subjects in the experimental groups were exposed to 20 negative cues at the predetermined rate. When this criterion had been achieved, verbal responses were recorded for an additional 4 minutes with the cue withdrawn for all four groups.

It was found that the verbal cue employed in this study (“Unh unh”) acted as a negative reinforcer. The influence of the different schedules was also revealed, with the most profound effect upon training and durability attributable to the variable interval schedule.
YEASAYERS AND NAYSAYERS: A VALIDATING STUDY


IRWIN MAHLER 


Occidental College 


In a recent article, Couch and Keniston (1960) discussed the rationale and the construction of the Agreement Response Scale (ARS), designed to measure an agreeing response tendency which was hypothesized to be a manifestation of a significant underlying personality syndrome. Using an original Over-all Agreement Score (OAS), these investigators selected a group of 10 Yeasayers (high agreement scores) and a group of 11 Naysayers (low agreement scores) for clinical study. The clinical findings revealed that Yeasayers were individuals with weak ego control who seemed to have strong external orientations, being primarily responsive to group values and demands. Naysayers, on the other hand, were more internally oriented, more introverted, and exhibited greater capacity to inhibit and suppress impulses.

The present investigation may be considered an attempt to establish the construct validity of the Agreement Response Scale. If a group of individuals could be found with strong group orientation, including acceptance of group values and demands, it might be expected that they would score higher on the ARS than would individuals lacking this group orientation. It seemed reasonable that members of college sororities and fraternities could be called group oriented, and thus could constitute one known group. However, because such group loyalty could be a consequence of membership rather than a cause of it, it appeared more desirable to conduct this study with subjects not yet members of these groups. Two hypotheses which might easily be tested were formulated:

1. College students who indicate a desire to 
join fraternities and sororities (Rushees) will 
have higher ARS scores than will those students 
who do not desire to join these organizations 
(Nonrushees).

2. Of those who rush, those who receive bids 
(Pledges), that is, those who are offered mem- 
bership in the fraternities and sororities, will 
have higher ARS scores than those who do not 
receive bids (Rushees-No Bid). 


METHOD 


The subjects for this study were 219 men and 163 women, all Occidental College freshmen. During the last week prior to the offering of bids by the fraternities and sororities, the ARS scale was administered in the various discussion groups of a required course for freshmen. The total of 382 freshmen who completed the scale represented almost the complete population of freshmen (395) in the college. Through the cooperation of the offices of the Deans of Men and Women it was possible to obtain the names of those freshmen who had signed up for, and begun, the rushing procedure. There were 72 freshmen women and 104 freshmen men who could be designated Rushees, leaving 91 women and 115 men to be designated Nonrushees. The following week it was possible to obtain the names of the 30 freshmen women and 71 men who received bids; they were designated as the Pledges. The remaining 33 men and 42 women were designated as Rushees-No Bid. The apparent differences between men and women in rushing and pledging have been typically true in this college, reflecting the fact that the fraternities are national organizations but the sororities are local. It also is due to limitation of the size of sororities; that is, these groups are limited to a small number of pledges each year, whereas the fraternities are not so limited.


RESULTS 


The mean ARS score for the 382 subjects was
computed to be 55.24, with the SD being 9.7.
For men, N = 219, the mean was 55.82 and the
SD was 9.48; for women, N = 163, the mean
was 54.45 and the SD was 9.96. This difference
between means is not significant (CR = 1.36), nor
are any of the apparent differences between sexes
presented for each subgroup in the tables below.
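
The critical ratio quoted above can be reproduced from the reported means, SDs, and Ns. The sketch below assumes that CR is the difference between the two means divided by the unpooled standard error of that difference; the report does not state the formula used, and the function and variable names are illustrative only.

```python
import math


def critical_ratio(m1, sd1, n1, m2, sd2, n2):
    """Difference between two means divided by its standard error.

    Assumption: the standard error is the unpooled form
    sqrt(sd1^2/n1 + sd2^2/n2); the report does not say which form was used.
    """
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (m1 - m2) / se


# Men (N = 219) versus women (N = 163), values as reported in the text.
cr_sex = critical_ratio(55.82, 9.48, 219, 54.45, 9.96, 163)
print(round(cr_sex, 2))
```

Under this assumption the ratio rounds to the reported 1.36, which falls short of the 1.96 needed for significance at the .05 level on a two-tailed test.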

Table 1 presents the data relevant to the first
hypothesis. It can be seen that male rushees
have significantly higher ARS scores than do
nonrushee males, but this difference does not hold
for women. Similarly, Table 2 shows that of those
men who rush, those who receive bids have
significantly higher ARS scores than those who do
not pledge. Once again, this difference is not
true for women.


TABLE 1

MEAN ARS SCORES OF RUSHEE AND NONRUSHEE GROUPS

            Rushees               Nonrushees
          M      N     SD       M      N     SD     Difference
Men       -     104    -      54.19   115    9.10      3.47
Women     -      72    -      53.72    91   10.30      1.66

A hyphen marks an entry that is illegible in the source.


TABLE 2

MEAN ARS SCORES OF PLEDGES AND RUSHEE-NO BID GROUPS

            Pledges               Rushee-No Bid
          M      N     SD       M      N     SD
Men     59.01    71   9.95    54.70    33    -
Women   57.27    30   8.95    54.02    42    -

*p > .10
**p < .02
A hyphen marks an entry that is illegible in the source.

It will be noticed that the men who do not 
receive bids have a mean ARS score (54.70) very 
similar to that of the men who do not rush 
(54.19), which would appear to further support 
the implicit hypothesis that fraternity men are 
different from nonfraternity men. The same rela- 
tive similarity is true for women: those who do 
not receive bids have a mean score (54.02) close 
to that of those women who do not rush (53.72). 
Table 3 shows that when the differences between 
pledges and all those who do not pledge are 
tested, it is found that the difference is signifi- 
cant for men, but for women it just misses
significance at the .05 level.
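
As a rough check on the combined nonpledge means in Table 3 (54.29 for men, 53.81 for women), the two subgroup means quoted above can be weighted by their Ns. This is a sketch only: the helper name is hypothetical, and because the inputs are means already rounded to two decimals, agreement is expected only to within about .01.

```python
def pooled_mean(groups):
    """Weighted mean of a list of (mean, n) pairs."""
    total_n = sum(n for _, n in groups)
    return sum(mean * n for mean, n in groups) / total_n


# Nonpledges = Rushees-No Bid plus Nonrushees, using the means and Ns
# given in the text for each sex.
men_nonpledge = pooled_mean([(54.70, 33), (54.19, 115)])
women_nonpledge = pooled_mean([(54.02, 42), (53.72, 91)])
print(f"{men_nonpledge:.2f} {women_nonpledge:.2f}")
```

The weighted means come out at about 54.30 and 53.81, matching the Table 3 entries up to rounding, with combined Ns of 148 and 133.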

It appears that the predicted results hold true
consistently for men, but not for women, even
though there are no significant differences in ARS
scores between the sexes. Inasmuch as Couch and
Keniston (1960) originally standardized the ARS
scale on men only, it may be that the scale
contains a few items which are more differentiating
for men than for women. The fact that the scale
may be less differentiating for women than for
men should not detract from the general
verification of the hypotheses about the differences
between externally and internally oriented
people. Further study of the applicability to women
of the concept of an agreeing response tendency
as an underlying personality syndrome seems to
be in order.¹


TABLE 3

MEAN ARS SCORES OF PLEDGES AND ALL NONPLEDGES

            Pledges               All Nonpledges
          M      N     SD       M      N     SD     Difference    CR
Men     59.01    71   9.95    54.29   148    8.80      4.72      3.73**
Women   57.27    30   8.95    53.81   133   10.06      3.46      1.84*

The significance levels given in the table footnotes are illegible in the source.


SUMMARY 


Couch and Keniston (1960) described a
personality syndrome which seemed to fit members
of fraternities and sororities, as opposed to
college men and women who do not join such
groups. Predictions were made as to the ARS
scores of those who will pledge these groups,
those who rush but are not pledged, and those
who do not rush. These predictions were
supported for men but not for women.


¹ In a separate study done subsequent to this one,
Lucy Beebe found a mean ARS score of 53.15 for 79
nonsorority women and a mean score of 57 for 89
sorority women, which produced a significant
difference at the .05 level. The subjects in that study
included no freshmen, so there was no overlap
between the subjects in the two studies.
ERRATUM

On page 637 of the article “Pancultural Factor Analysis of Reported
Socialization Practices,” by Leigh Minturn Triandis and William W. Lambert (J.
abnorm. soc. Psychol., 1961, 62, 631-639), the authors found that the scoring of
the sex of child variable is such that this factor actually reflects mothers who permit
their girls to be aggressive. Thus the word boy should be replaced by girl in the
description of Factor 9. The correct wording is:

(Scale 2, +.48, reflecting the “degree to which mother is positive when child fights other
children”), tied somewhat to girls . . . which gives a picture of an expressive, positive
mother, who permits her girl to be expressive as well.